PREFACE


Almost six and a half years have passed since I started this research on June 16, 1981, and these
were hectic years. During this period, many things were designed and accomplished. Even though I am
the principal investigator, I find it practically impossible to include and systematize all the important
findings and implications within a single final report, and I regret that so many of them have
to be left out. I did my best within a limited amount of time, however, with the hope that this final
report will help the reader to grasp the outline of the whole accomplishment.

There  were  six  main  objectives  in  the  original  research  proposal,  and  they  can  be  summarized  as 
follows. 

[1] Investigation of theory and method for estimating the operating characteristics of discrete
item responses, which include the plausibility functions of the distractors of the
multiple-choice test item, as well as the graded item responses of the free-response test
item, without assuming any specific mathematical forms and without using too many
examinees in the whole procedure.

[2] Investigation of the various characteristics of the new family of models for the multiple-choice
test item, both in theory and in practice.

[3] Production and revision of a set of systematic procedures for applying some combinations
of a method and an approach for estimating the operating characteristics of discrete
item responses, by modifying and reorganizing all the computer programs written for this
purpose.

[4] Further development of latent trait theory to include more varieties of situations.

[5] Investigation of ways of bridging across mathematical psychology and cognitive psychology,
through latent trait theory.

[6] Systematizing theories and methods to eventually lead to a good introductory book on
latent trait theory and other publications.

Out of these objectives, Objectives [1] and [4], together with Objectives [3] and [5], were most intensively
pursued. The highest productivity belongs to this part of the research. It provided us with valuable
perspectives for future research. Objective [2] was also successfully pursued. In contrast,
Objective [6] was more or less dropped. To compensate for it, however, some extensive research was
done concerning the three-parameter logistic model. The main reason was that the Navy had
adopted the model for its computerized adaptive testing, and there was a necessity to pursue it.

It was my satisfaction and pleasure that the Advanced Seminar on Latent Trait Theory was planned
and held during this research period, and also that I had opportunities to introduce the research at
international conferences as well as at domestic ones.

During the research period many people helped me as assistants, secretaries, etc.,
as I acknowledged in each research report. Also, people of the Office of Naval Research, especially Dr.
Charles E. Davis, and the people of the ONR Atlanta Office, including Mr. Thomas Bryant and Mr.
Donald Calder, have been of great help in conducting the research. I would like to express my gratitude
to all these people.


Thanks  are  also  due  to  my  two  assistants,  Christine  A.  Golik  and  Ali  Khaddouma,  and  secretary, 
Betty  Jo  Allen,  who  helped  me  in  preparing  this  final  report.  Appreciation  is  also  extended  to  my 
former  assistant  Philip  S.  Livingston  who  still  helped  me  occasionally  during  the  research  period. 



December  25,  1987 
Author 


TABLE  OF  CONTENTS 


I  Introduction
    I.1  Research Reports
    I.2  Advanced Seminar on Latent Trait Theory
    I.3  Invited Conference Addresses
    I.4  Paper Presentations at National and International Conferences
    I.5  Book Chapters
    I.6  Other Events

II  Theory and Methods for Estimating the Operating Characteristics of Discrete Item Responses
    II.1  Conditional P.D.F. Approach Combined with the Normal Approach Method
    II.2  Lognormal Approach Method
    II.3  Discussion

III  Bias Function of the Maximum Likelihood Estimate of Ability in the General Discrete Response Case
    III.1  Background
    III.2  Rationale
    III.3  Bias Function and Amount of Test Information
    III.4  Increment in Bias Caused by Random Guessing
    III.5  Adaptive Testing
    III.6  Discussion

IV  Constancy in Item Information and the Information Loss Caused by Noise on the Dichotomous Response Level
    IV.1  Four Types of Models for Dichotomous Test Items
    IV.2  Information Loss
    IV.3  Basic Functions and Item Response Information Functions
    IV.4  Three-Parameter Logistic Model
    IV.5  Loss in Speed of Convergence of the Conditional Distribution of the Maximum Likelihood Estimate to the Normality
    IV.6  Discussion

V  A Latent Trait Model for Differential Strategies in Cognitive Processes
    V.1  Rationale
    V.2  Differential Strategy Trees
    V.3  A General Model for Differential Strategies
    V.4  Homogeneous Case
    V.5  Single Strategy Case
    V.6  Information Provided by Differential Strategies
    V.7  Discussion

VI  Latent Trait Models for Partially Continuous and Partially Discrete Responses
    VI.1  Rationale
    VI.2  Conditional Distribution of the Item Score
    VI.3  Parametric and Nonparametric Estimations of Operating Density Characteristics
    VI.4  Estimation of the Individual Parameters of Subjects
    VI.5  Prospect of Adopting These Models for Rorschach Diagnosis
    VI.6  Discussion

VII  Informative Distractors and Their Plausibility Functions in the Multiple-Choice Test Items
    VII.1  Iowa Test Data
    VII.2  Method
    VII.3  Results
    VII.4  Discussion

VIII  Analysis of Shiba's Data Collected upon His Word/Phrase Comprehension Tests: Comparison of Tetrachoric Method and Logist 5 on Empirical Data
    VIII.1  Shiba's Data
    VIII.2  Results
    VIII.3  Discussion

IX  Item Parameter Estimation Using Logist 5 on Simulated Data
    IX.1  Simulated Data
    IX.2  Method
    IX.3  Results
    IX.4  Discrimination Shrinkage Factor and Difficulty Reduction Index
    IX.5  Discussion

X  Discussion and Conclusions

Appendix A:  Contents of the Advanced Seminar on Latent Trait Theory (1982)

Appendix B:  Overview of Latent Trait Models: Paper Presented at the Fifteenth Annual Meeting of the Behaviormetric Society of Japan

Appendix C:  Ten Items of the Rorschach Test and Their Scorings for the Purpose of Measuring the Intellectual Capability Using Appropriate Latent Trait Models


I  Introduction 


This is the final report of the multi-year research project entitled "Advancement of Latent Trait
Theory", which was sponsored by the Office of Naval Research from 1981 through 1987 (N00014-81-C-0569).
The first half of the research project was conducted under the above title from June, 1981 through
June, 1984. Right after this, from June, 1984 through December, 1987, the second half of the research
continued under the title "Advancement of Latent Trait Theory II". Since the objectives of the two
proposed research projects are parallel, i.e., those of the second half are to pursue the objectives of
the first half further and in more detail, the present final report will treat them as one long research
project, and the accomplishments of these two projects will be integrated and systematized together.
These accomplishments include those which have already been published as ONR research reports as
well as those still in progress, which will be published in later years as part of more comprehensive
research results.

The  rest  of  this  chapter  will  describe  papers  published  or  presented  during  the  research  period,  and 
related  events.  The  contents  of  the  research  accomplishments  will  be  summarized  and  systematized, 
and  will  be  described  in  the  succeeding  chapters. 

[I.1]  Research Reports

The  following  are  the  ONR  research  reports  that  have  been  published  in  the  present  research  project. 

(1)  Information  loss  caused  by  noise  in  models  for  dichotomous  items.  Office  of  Naval  Research 
Report  82-1,  1982. 

(2)  Effect  of  Noise  in  the  Three-Parameter  Logistic  Model.  Office  of  Naval  Research  Report 
82-2,  1982. 

(3)  A  Latent  Trait  Model  for  Differential  Strategies  in  Cognitive  Processes.  Office  of  Naval 
Research  Report  83-1,  1983. 

(4)  Information  functions  for  the  general  model  developed  for  differential  strategies  in  cognitive 
processes.  Office  of  Naval  Research  Report  83-2. 

(5)  A  general  model  for  the  homogeneous  case  of  the  continuous  response.  Office  of  Naval 
Research  Report  83-3,  1983. 

(6)  Plausibility functions of Iowa Vocabulary Test items estimated by the Simple Sum Procedure
of the Conditional P.D.F. Approach. Office of Naval Research Report 84-1, 1984.

(7)  Comparison  of  the  estimated  item  parameters  of  Shiba’s  Word/Phrase  Comprehension 
Tests  obtained  by  LOGIST  5  and  those  by  the  tetrachoric  method.  Office  of  Naval  Research 
Report  84-2,  1984. 

(8)  Results  of  item  parameter  estimation  using  Logist  5  on  simulated  data.  Office  of  Naval 
Research  Report  84-3,  1984. 

(9)  Bias  function  of  the  maximum  likelihood  estimate  of  ability  for  discrete  item  responses. 
Office  of  Naval  Research  Report  87-1,  1987. 

(10)  Final  Report:  Advancement  of  latent  trait  theory.  Office  of  Naval  Research  Final  Report, 
1988. 


[I.2]  Advanced Seminar on Latent Trait Theory

As was proposed in the present research, the Advanced Seminar on Latent Trait Theory was held at the
Sheraton Gatlinburg Hotel, Gatlinburg, Tennessee, for four days, from March 30 through April 2, 1982.

Lectures were given to approximately forty researchers and graduate students from all over the
country, with the principal investigator as the sole speaker. The contents of the lectures were taken
mainly from her work of the past five years on various topics in Latent Trait Theory, including more
general topics such as the method of moments as the least squares solution for fitting polynomials, etc.
The topics and contents of this Advanced Seminar are given in Appendix A.

The computer package programs were also introduced to the participants of the seminar. These
seven package programs are as follows.


(1)  TAU TRANSFORMATION: The process of transforming the original latent trait $\theta$ to
$\tau$, which provides the Old Test with a constant amount of test information. The Old Test is
a set of test items whose operating characteristics are known, and in this case they follow
the normal ogive model. The operating characteristics of "unknown" test items are to be
estimated, depending partially upon the information provided by the Old Test.

(2)  MLE THETA: The process of obtaining the maximum likelihood estimate of $\theta$ for each
individual examinee from his or her response pattern on the Old Test.

(3)  CONDITIONAL MOMENTS MLE: The process of estimating the conditional moments
of $\tau$, given its maximum likelihood estimate $\hat{\tau}$, which is transformed from $\hat{\theta}$, i.e., the
maximum likelihood estimate of $\theta$. It also includes in the process the approximation of
the density function of $\hat{\tau}$ using the method of moments to fit a polynomial to the set of
observations.

(4)  SIMPLE/WEIGHTED SUM NT: Simple Sum Procedure and Weighted Sum Procedure of
the Conditional P.D.F. Approach combined either with the Normal Approach Method or
with the Two-Parameter Beta Method, to produce the estimated operating characteristics
of the discrete item responses of the "unknown" test items.

(5)  PROPORTIONED SUM NT: Proportioned Sum Procedure of the Conditional P.D.F. Approach
combined either with the Normal Approach Method or with the Two-Parameter
Beta Method, to produce the estimated operating characteristics of the discrete item
responses of the "unknown" test items.

(6)  CONDITIONAL MOMENTS SUBGROUP: The process of estimating the conditional moments
of $\tau$, given its maximum likelihood estimate $\hat{\tau}$, which is transformed from $\hat{\theta}$,
i.e., the maximum likelihood estimate of $\theta$, for each discrete item response subgroup
of each "unknown" test item. It also includes the approximation of the density function
of $\hat{\tau}$ for each subgroup using the method of moments to fit a polynomial to the set of
observations.

(7)  BIVARIATE P.D.F. NT: Bivariate P.D.F. Approach combined either with the Normal
Approach Method or with the Two-Parameter Beta Method, to produce the operating
characteristics of the discrete item responses of the "unknown" test items.
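
As a rough illustration of the TAU TRANSFORMATION step, the Python sketch below assumes that the transformed scale is obtained by integrating the square root of the Old Test information function. Under any strictly increasing transformation the test information is divided by the square of the derivative of the new scale with respect to $\theta$, so this choice makes the information on the transformed scale equal to the constant c squared. The item parameters (a, b), the grid of $\theta$ values, the constant c and all function names are hypothetical and are not taken from the package programs.

```python
import math


def normal_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def normal_pdf(z):
    """Standard normal density function."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def old_test_information(theta, items):
    """Test information at theta for dichotomous normal ogive items.

    items is a list of hypothetical (a, b) pairs:
    a = discrimination parameter, b = difficulty parameter.
    """
    total = 0.0
    for a, b in items:
        z = a * (theta - b)
        p = normal_cdf(z)
        total += (a * normal_pdf(z)) ** 2 / (p * (1.0 - p))
    return total


def tau_transform(thetas, items, c=1.0):
    """Map an increasing grid of theta values to tau values.

    Uses trapezoidal integration of sqrt(I(theta)) / c, with tau set to
    zero at the first grid point (the origin of tau is arbitrary here).
    """
    roots = [math.sqrt(old_test_information(t, items)) / c for t in thetas]
    taus = [0.0]
    for i in range(1, len(thetas)):
        step = thetas[i] - thetas[i - 1]
        taus.append(taus[-1] + 0.5 * step * (roots[i] + roots[i - 1]))
    return taus
```

The grid should stay within a range where the item response probabilities are not numerically 0 or 1, since the information formula divides by their product.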

Judging from the participants' reactions during and after the Seminar, it is believed that the Seminar
gave them a good grasp of the new theoretical and methodological developments made by the principal
investigator, i.e., an accomplishment toward the goal of the present research, Advancement of Latent
Trait Theory.

[I.3]  Invited Conference Addresses

During  this  period,  there  were  two  invited  conference  paper  presentations  and  one  special  lecture 
introducing  some  of  the  accomplishments  of  the  principal  investigator’s  research.  They  are  as  follows: 

(1)  Some methods and approaches of estimating the operating characteristics of discrete item
responses. Dr. Frederic M. Lord's Festschrift Conference to Celebrate His Seventieth Birthday,
Educational Testing Service, 1982, Princeton, New Jersey, U. S. A.

(2)  Development and application of methods for estimating operating characteristics of discrete
item responses without assuming any mathematical form. 1982 Item Response Theory
and Computerized Adaptive Testing Conference, University of Minnesota, 1982, Minneapolis,
Minnesota, U. S. A.

(3)  Overview of latent trait models. 1987 Annual Meeting of Behaviormetric Society of Japan,
Kyushu University, 1987, Fukuoka, Japan.

At the same conference where the principal investigator presented the paper described in (2), she also
served as the discussant to Dr. Roderick P. McDonald's paper, "Unidimensional and multidimensional
models for item response theory."

The  address  described  as  (3)  in  the  above  list  was  a  one  hour  special  lecture  overviewing  latent 
trait  models.  There  were  more  than  two  hundred  Japanese  researchers  in  behaviormetrics  among  the 
audience.  The  summary  of  the  paper  is  given  as  Appendix  B  of  this  report. 


[I.4]  Paper Presentations at National and International Conferences

In  addition  to  the  three  invited  papers  which  were  listed  in  the  preceding  section,  there  were  other 
paper  presentations  at  national  and  international  conferences  introducing  the  principal  investigator’s 
work.  They  include  ONR  contractors’  meetings,  and  are  listed  below. 


(1)  Model Validation, Estimation of Plausibility Functions, Models for Cognitive Processes,
and Effect of Noise in the Three-Parameter Logistic Model. ONR Conference on Model-Based
Psychological Measurement, 1983, University of Illinois, Champaign, Illinois, U. S. A.

(2)  A Latent Trait Model for Differential Strategies. ONR Conference on Action, Attention
and Individual Differences in Information Processing, 1984, Haskins Laboratories, New
Haven, Connecticut, U. S. A.

(3)  Specification of the Information Provided by Distractors of the Multiple-Choice Test Item
and Efficient Ability Estimation. American Educational Research Association Meeting,
New Orleans, 1984. U. S. A.

(4)  Efficient Use of Distractors in Ability Estimation with the Multiple-Choice Test Item.
American Educational Research Association Meeting, New Orleans, 1984. (Coauthorship
with Paul S. Changas) U. S. A.

(5)  An Application of Latent Trait Theory in Analyzing the Field Test Results of a Mathematics
Proficiency Test. American Educational Research Association Meeting, New Orleans,
1984. (Coauthorship with Megumi Asako and Allen Knight) U. S. A.

(6)  Further Investigation of the Estimation of the Item Characteristic Function and the Plausibility
Functions of the Multiple-Choice Test Item. ONR Conference on Model Based
Measurement, 1984, Educational Testing Service, Princeton, New Jersey, U. S. A.

(7)  A Latent Trait Model When the Item Score Distribution Is Partly Continuous and Partly
Discrete. ONR Conference on Model-Based Psychological Measurement, 1984, Educational
Testing Service, Princeton, New Jersey, U. S. A.

(8)  Latent Trait Models Dealing with Continuous Data. American Educational Research Association
Meeting, Chicago, 1985. U. S. A.

(9)  A Content-Based Investigation of Informative Distractors for Multiple-Choice Items of the
Iowa Tests of Basic Skills. American Educational Research Association Meeting, Chicago,
1985. U. S. A. (Coauthorship with Paul S. Changas)

(10)  Expansion of the General Model for the Homogeneous Case of the Continuous Response
Level with a Partly Continuous and Partly Discrete Item Score Distribution in the Framework
of Latent Trait Theory. Psychometric Society 50th Anniversary Meeting, 1985, Vanderbilt
University, Nashville, Tennessee, U. S. A.

(11)  Latent trait theory as applications of stochastic processes. Fifteenth Conference on Stochastic
Processes and Their Applications, 1985, under the auspices of the Committee for Conferences
on Stochastic Processes of the Bernoulli Society. Nagoya University, Nagoya,
Japan.

(12)  Effect of the guessing parameter on the estimation of the item discrimination and difficulty
parameters when three-parameter logistic model is assumed. American Educational
Research Association Meeting, San Francisco, California, 1986, U. S. A.

(13)  Content-based observation of informative distractors, bias function of the maximum likelihood
estimate of the latent trait when item responses are discrete, etc. ONR Conference
on Model-Based Measurement, Gatlinburg, Tennessee, 1986, U. S. A.

(14)  Bias function of the maximum likelihood estimate of the latent trait when item responses
are discrete. American Educational Research Association Meeting, Washington, D. C.,
1987, U. S. A.

(15)  Striving for the refinement of the conditional P.D.F. approach for estimating the operating
characteristics of discrete responses. ONR Conference on Model-Based Measurement,
Columbia, South Carolina, 1987, U. S. A.

(16)  A robust method of on-line calibration. American Educational Research Association Meeting,
New Orleans, 1988, U. S. A. (Proposed and accepted.)

Out of these paper presentations, the one listed as (11) was made at the international conference
held in Nagoya, Japan. Approximately three hundred and sixty researchers, the majority of whom
were mathematicians, participated from twenty-five different countries. The principal investigator also
chaired one of the sessions at the request of the conference organizer, Professor Takeyuki Hida of Nagoya
University.


[I.5]  Book Chapters


Some  of  the  principal  investigator’s  works  were  published  as  book  chapters  in  two  books  during  this 
period.  They  are  as  follows: 


(1)  Some methods and approaches of estimating the operating characteristics of discrete item
responses. In H. Wainer and S. Messick (Eds.), Principals of Modern Psychological Measurement:
A Festschrift for Frederic M. Lord, pages 159-182. New Jersey: Lawrence Erlbaum,
1983.

(2)  The constant information model on the dichotomous response level. In David J. Weiss
(Ed.), New Horizons in Testing, pages 287-308. New York: Academic Press, 1983.


[I.6]  Other Events


The principal investigator hosted an annual ONR Conference on Model-Based Measurement in 1986,
from April 27 through 30, at the Park Vista Hotel, Gatlinburg, Tennessee. Approximately forty researchers
participated in the conference.


She  also  gave  a  seminar  on  "the  On-Line  Item  Calibration  Using  Nonparametric  Approaches,”  in 
July,  1987,  at  Educational  Testing  Service. 


II  Theory  and  Methods  for  Estimating  the  Operating 
Characteristics  of  Discrete  Item  Responses 

This part of the research aimed at further developments and modifications of the methods and approaches
developed by the principal investigator. During the previous four years, 1977 through 1981, the principal
investigator had been engaged in the research sponsored by the Office of Naval Research, under the
title, "Efficient Methods of Estimating the Operating Characteristics of Item Response Categories and
Challenge to a New Model for the Multiple-Choice Item." One of the main outcomes of that research was
a set of methods and approaches for the efficient estimation of the operating characteristics of discrete
item responses. They are listed in the summary of the principal investigator's special lecture given
at Fukuoka, Japan (Appendix B, page 4), and the computer package programs for these methods are
described in [I.2].

Two  important  features  of  the  principal  investigator’s  approach  are  the  following. 

(1)  It  does  not  assume  any  specific  mathematical  form  for  the  operating  characteristic,  or  the 
conditional  probability,  given  the  latent  trait,  with  which  the  individual  subject  gives  a 
specific  discrete  response. 

(2)  It  does  not  require  a  large  sample  size  of  individual  subjects,  i.e.,  no  more  than  several 
thousand  and  sometimes  even  down  to  several  hundred. 

In  the  present  research,  computer  programs  written  in  the  previous  research  were  tested  with 
empirical  and  simulated  data,  revised  and  improved,  used  for  empirical  data  (cf.  Chapter  VII),  revised 
again,  and  so  forth.  Many  variations  of  these  programs  were  produced.  Among  others,  a  series  of 
variations for computerized adaptive testing and on-line item calibration was written. A new method
called the Lognormal Approach Method was proposed and tested. The bias function of the maximum likelihood
estimate was conceived and proposed (cf. Chapter III) partly from the necessity for increasing the
accuracy in the operating characteristic estimation, as well as for the general purpose of the advancement
of latent trait theory.

[II.1]  Conditional P.D.F. Approach Combined with the Normal Approach
Method

Out of these different approaches, the Bivariate P.D.F. Approach may be the most orthodox one. It has
its disadvantages in comparison with the Conditional P.D.F. Approach, however, in the sense that: 1) it
requires a larger sample size of individual subjects, and 2) its CPU time is substantially greater because
the estimation has to be done for one item at a time. On the other hand, in spite of
the additional approximation involved in the Conditional P.D.F. Approach, in the present research the
results of this approach proved to be quite accurate in many cases.

Let $\theta$ be the latent trait, or "ability", which assumes any real number. Let $\tau$ be the transformed
ability, which is strictly increasing in $\theta$. In the Conditional P.D.F. Approach, the conditional density,
$\phi(\tau \mid \hat{\tau}_s)$, of $\tau$, given its maximum likelihood estimate $\hat{\tau}_s$ of the subject $s$, is approximated by
some specified probability density function. In so doing, first we estimate the conditional moments of
$\tau$, given $\hat{\tau}_s$, and then use the method of moments for fitting a specific probability density function.
Three methods, i.e., Pearson System Method, Two-Parameter Beta Method and Normal Approach
Method, have been proposed and tested. Out of these three methods, Pearson System Method provides
us with more varieties of "shapes" for the conditional density, including asymmetric ones. Thus it is
theoretically more adequate than the other two, i.e., Normal Approach Method and Two-Parameter
Beta Method. These two methods have their own advantage, however, for they avoid the use of the third
and fourth moments of the conditional distribution of the transformed ability, which are less accurately
estimated than the first and second moments.

In  our  investigation,  Normal  Approach  Method  combined  either  with  the  Simple  Sum  Procedure 
of  the  Conditional  P.D.F.  Approach  or  with  the  Bivariate  P.D.F.  Approach  was  mainly  used  in  the 
estimation  of  plausibility  functions  and  in  the  model  validation.  The  results  suggest  that,  with  a  sample 
size  as  small  as  several  hundred,  while  there  are  no  visible  differences  between  the  two  in  estimating 
the  item  characteristic  functions  of  dichotomous  test  items,  the  first  combination  seems  to  be  better 
than  the  second  for  the  estimation  of  the  plausibility  functions  of  distractors  and  of  the  operating 
characteristics  of  graded  item  responses.  The  general  success  in  the  use  of  the  Normal  Approach 
Method  depends  upon  the  fact  that  the  conditional  distribution  is  indeed  approximately  normal  if  the 
transformed  ability  distribution  is  normal,  and  is  approximately  truncated  normal  if  the  distribution  is 
rectangular,  with  the  truncation  negligibly  small  for  the  wide  range  of  the  maximum  likelihood  estimate 
of  the  transformed  ability. 

For  the  reasons  described  above,  after  these  investigations,  Conditional  P.D.F.  Approach  combined 
with  the  Normal  Approach  Method  was  most  frequently  used  in  the  present  research.  Among  others, 
application  of  this  combination  to  the  estimation  of  the  plausibility  functions  of  the  wrong  alternative 
answers of the Iowa Tests of Basic Skills items turned out to be very successful. It will be introduced
in  Chapter  VII. 

[II.2]  Lognormal Approach Method

As  was  pointed  out  in  the  preceding  section,  the  estimation  of  the  conditional  moment  of  ability,  given 
its  maximum  likelihood  estimate,  becomes  less  accurate  as  the  degree  of  the  moment  advances.  Thus 
Pearson  System  Method  must  use  fairly  inaccurately  estimated  fourth  conditional  moments,  in  addition 
to the better estimated first through third moments. On the other hand, although the Normal Approach
Method has the advantage of using only the fairly accurately estimated first and second moments, it has the
disadvantage of forcing the estimated conditional density function into symmetry. This forced symmetry
could be inappropriate when the population ability distribution departs from normality.

For  this  reason,  it  will  be  a  logical  direction  of  research  to  pursue  another  method  which  uses  the 
first,  second  and  third  conditional  moments,  allowing  asymmetry  to  the  conditional  density  function 
of  ability  without  using  the  fourth  moment.  Thus  Lognormal  Approach  Method  was  developed  and 
proposed. Although it has not been published in a research report yet, it was introduced at the 1987
ONR Conference on Model-Based Measurement (cf. [I.4], item (15)).

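Let $\mu'_r$ denote the r-th conditional moment of $\tau$ about the origin, given $\hat{\tau}_s$, and $\mu_r$ be that of
$\tau$ about the mean, respectively, i.e.,

(2.1)  $\mu'_r = E(\tau^r \mid \hat{\tau}_s)$

and

(2.2)  $\mu_r = E[(\tau - \mu'_1)^r \mid \hat{\tau}_s]$ .

In the Lognormal Approach Method, we need the estimates of $\mu'_1$, $\mu_2$ and $\mu_3$ for each $\hat{\tau}_s$. Let
$\delta$ denote the skewness index such that

(2.3)  $\delta = \mu_3 \, \mu_2^{-3/2}$ .

When $\delta > 0$, the approximated conditional density function is provided by

(2.4)  $\phi(\tau \mid \hat{\tau}_s) = (\tau - \alpha)^{-1} (2\pi)^{-1/2} \gamma^{-1} \exp[-\{\log(\tau - \alpha) - \beta\}^2 / (2\gamma^2)]$

where the estimation of the three parameters is conducted through the relationships

(2.5)
$$
\begin{aligned}
\mu'_1 &= \int_{\alpha}^{\infty} \tau \, \phi(\tau \mid \hat{\tau}_s) \, d\tau \\
       &= \exp\{\beta + (1/2)\gamma^2\} + \alpha \; ,
\end{aligned}
$$

(2.6)
$$
\begin{aligned}
\mu_2 &= \int_{\alpha}^{\infty} (\tau - \mu'_1)^2 \, \phi(\tau \mid \hat{\tau}_s) \, d\tau \\
      &= \int_{\alpha}^{\infty} (\tau - \alpha)^2 \, \phi(\tau \mid \hat{\tau}_s) \, d\tau - (\alpha - \mu'_1)^2 \\
      &= \exp\{2\beta + 2\gamma^2\} - \exp\{2\beta + \gamma^2\} \\
      &= \nu^2 \omega (\omega - 1)
\end{aligned}
$$

and

(2.7)
$$
\begin{aligned}
\mu_3 &= \int_{\alpha}^{\infty} (\tau - \mu'_1)^3 \, \phi(\tau \mid \hat{\tau}_s) \, d\tau \\
      &= \int_{\alpha}^{\infty} (\tau - \alpha)^3 \, \phi(\tau \mid \hat{\tau}_s) \, d\tau
         + 3 \int_{\alpha}^{\infty} (\tau - \alpha)^2 (\alpha - \mu'_1) \, \phi(\tau \mid \hat{\tau}_s) \, d\tau \\
      &\quad + 3 \int_{\alpha}^{\infty} (\tau - \alpha)(\alpha - \mu'_1)^2 \, \phi(\tau \mid \hat{\tau}_s) \, d\tau
         + \int_{\alpha}^{\infty} (\alpha - \mu'_1)^3 \, \phi(\tau \mid \hat{\tau}_s) \, d\tau \\
      &= \exp\{3\beta + (9/2)\gamma^2\} - 3 \exp\{3\beta + (5/2)\gamma^2\} + 2 \exp\{3\beta + (3/2)\gamma^2\} \\
      &= \nu^3 \omega^{3/2} (\omega - 1)^2 (\omega + 2) \; ,
\end{aligned}
$$

where

(2.8)  $\omega = \exp\{\gamma^2\}$

and

(2.9)  $\nu = \exp(\beta)$ .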

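A similar but somewhat different procedure is taken when $\delta < 0$. In this case, the approximated
conditional density function is provided by

(2.10)  $\phi(\tau \mid \hat{\tau}_s) = (\alpha - \tau)^{-1} (2\pi)^{-1/2} \gamma^{-1} \exp[-\{\log(\alpha - \tau) - \beta\}^2 / (2\gamma^2)]$

where the estimation of the three parameters is conducted through the relationships

(2.11)
$$
\begin{aligned}
\mu'_1 &= \int_{-\infty}^{\alpha} \tau \, \phi(\tau \mid \hat{\tau}_s) \, d\tau \\
       &= -\exp\{\beta + (1/2)\gamma^2\} + \alpha \; ,
\end{aligned}
$$

(2.12)
$$
\begin{aligned}
\mu_2 &= \int_{-\infty}^{\alpha} (\tau - \mu'_1)^2 \, \phi(\tau \mid \hat{\tau}_s) \, d\tau \\
      &= \int_{-\infty}^{\alpha} (\tau - \alpha)^2 \, \phi(\tau \mid \hat{\tau}_s) \, d\tau - (\alpha - \mu'_1)^2 \\
      &= \exp\{2\beta + 2\gamma^2\} - \exp\{2\beta + \gamma^2\} \\
      &= \nu^2 \omega (\omega - 1)
\end{aligned}
$$

and

(2.13)
$$
\begin{aligned}
\mu_3 &= \int_{-\infty}^{\alpha} (\tau - \mu'_1)^3 \, \phi(\tau \mid \hat{\tau}_s) \, d\tau \\
      &= \int_{-\infty}^{\alpha} (\tau - \alpha)^3 \, \phi(\tau \mid \hat{\tau}_s) \, d\tau
         + 3 \int_{-\infty}^{\alpha} (\tau - \alpha)^2 (\alpha - \mu'_1) \, \phi(\tau \mid \hat{\tau}_s) \, d\tau \\
      &\quad + 3 \int_{-\infty}^{\alpha} (\tau - \alpha)(\alpha - \mu'_1)^2 \, \phi(\tau \mid \hat{\tau}_s) \, d\tau
         + \int_{-\infty}^{\alpha} (\alpha - \mu'_1)^3 \, \phi(\tau \mid \hat{\tau}_s) \, d\tau \\
      &= -\exp\{3\beta + (9/2)\gamma^2\} + 3 \exp\{3\beta + (5/2)\gamma^2\} - 2 \exp\{3\beta + (3/2)\gamma^2\} \\
      &= -\nu^3 \omega^{3/2} (\omega - 1)^2 (\omega + 2) \; .
\end{aligned}
$$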


The actual procedure starts from obtaining the estimate of $\delta$ from (2.3), and then that of $\omega$ through
the relationship

(2.14)  $\delta^2 = (\omega - 1)(\omega + 2)^2$ .

Then we proceed to estimate the two parameters $\gamma$ and $\beta$ concurrently through

(2.15)  $\gamma = [\log \omega]^{1/2}$ ,

(2.16)  $\nu = \mu_2^{1/2} \omega^{-1/2} (\omega - 1)^{-1/2}$

and

(2.17)  $\beta = \log \nu$ ,
and finally we obtain the estimate of $a$ through

(2.18)  $a = \mu'_1 + \exp\{\beta + (1/2)\gamma^2\}$     $\delta < 0$

          $= \mu'_1 - \exp\{\beta + (1/2)\gamma^2\}$     $\delta > 0$ .
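The estimation steps (2.14) through (2.18) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name is hypothetical, the skewness index is taken to be $\mu_3 / \mu_2^{3/2}$ (the form implied by (2.12), (2.13) and (2.14)), and (2.14) is solved for $\omega$ by simple bisection, which is not necessarily the numerical method used in the original programs.

```python
import math


def fit_three_parameter_lognormal(mu1, mu2, mu3, tol=1e-12):
    """Return (a, beta, gamma) from the conditional mean mu1, the variance mu2
    and the third central moment mu3 (hypothetical helper, not from the report)."""
    if mu2 <= 0.0 or mu3 == 0.0:
        raise ValueError("a positive variance and a nonzero third moment are required")
    # Assumed form of the skewness index; it is consistent with (2.14).
    delta = mu3 / mu2 ** 1.5
    target = delta * delta
    # Solve (omega - 1)(omega + 2)^2 = delta^2 for omega > 1 by bisection;
    # the left-hand side is strictly increasing in omega on that range.
    lo, hi = 1.0, 2.0
    while (hi - 1.0) * (hi + 2.0) ** 2 < target:
        hi *= 2.0
    while hi - lo > tol * hi:
        mid = 0.5 * (lo + hi)
        if (mid - 1.0) * (mid + 2.0) ** 2 < target:
            lo = mid
        else:
            hi = mid
    omega = 0.5 * (lo + hi)
    gamma = math.sqrt(math.log(omega))                # (2.15)
    nu = math.sqrt(mu2 / (omega * (omega - 1.0)))     # (2.16)
    beta = math.log(nu)                               # (2.17)
    shift = math.exp(beta + 0.5 * gamma ** 2)
    a = mu1 + shift if delta < 0.0 else mu1 - shift   # (2.18)
    return a, beta, gamma
```

Feeding the function the exact moments of a lognormal density with known $a$, $\beta$ and $\gamma$ should return those three values, for either sign of the skewness.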


We tested this method, implemented in the Conditional P.D.F. Approach, with some simulated data,
and the results turned out to be at least as good as those obtained by the Normal Approach Method.
Figure 2-1 illustrates six examples of the estimated conditional density of $\tau$, given $\hat{\tau}_s$, in comparison
with the true density function and the one estimated by the Normal Approach Method. We can see that
the improvement is substantial when the true curve is either negatively or positively skewed. This
happens when $\hat{\tau}_s$ is much greater or much less than the mean of $\hat{\tau}_s$.

The method will be fully appreciated, however, only when it is tested against data having
unconditional distributions of $\tau$ which deviate substantially from normality and from uniformity.
This will be done in separate research in the near future.

[II.3]  Discussion

This part of the research included a substantial amount of computer programming, not only for making
the Lognormal Approach Method accessible but also for modifying and improving the package programs
already written. Another effort was directed at adapting these methods and approaches to
computerized adaptive testing. This is still in progress, and will be reported in a separate research
report in the future.

There  are  many  other  developments  and  findings  which  are  not  given  here.  The  reader  who  is 
interested  is  directed  to  the  separate  research  reports  and/or  to  personal  conversations  with  the  principal 
investigator. 


[FIGURE 2-1. Horizontal axis: TAU. Caption, partly legible: "... Line) Approximating the Truth Curve
(Solid Line) in ... Density Curve (Dashed Line)."]



III  Bias  Function  of  the  Maximum  Likelihood  Estimate  of 
Ability  in  the  General  Discrete  Response  Case 

In the theory and methods developed for estimating the operating characteristics of discrete item
responses, which were described in the preceding chapter, the maximum likelihood estimate $\hat{\theta}$ of ability
$\theta$, and also $\hat{\tau}$ of the transformed ability $\tau$, play important roles. Since these methods use the asymptotic
unbiasedness and normality of the conditional distribution of the maximum likelihood estimate, given
the true parameter, as an approximation, it is of serious concern to us whether the maximum
likelihood estimate is indeed practically conditionally unbiased with actual data for the interval of ability
of interest. For this reason and for many others, in the second half of the research period, the bias
function of the maximum likelihood estimate of ability was investigated in the general case where item
responses are discrete. In this chapter, its main outcomes will be outlined. For details
and more information, see [I.1.9].

[III.1]  Background

Lord has proposed and discussed a bias function of the maximum likelihood estimate in the context
of the three-parameter logistic model (cf. Lord, 1983). In so doing, he used Taylor's expansion of the
likelihood equation, obtained an equation which includes the conditional
expectation of the discrepancy between the maximum likelihood estimate and the true ability, and
ignored all terms of orders higher than $n^{-1}$, where $n$ indicates the number of items. Let $P_g(\theta)$ be
the item characteristic function or item response function in the three-parameter logistic model, which
is given by

(3.1)  $P_g(\theta) = c_g + (1 - c_g)[1 + \exp\{-D a_g (\theta - b_g)\}]^{-1}$ ,

where $a_g$, $b_g$, and $c_g$ are the item discrimination, difficulty, and guessing parameters, and $D$ is a
scaling factor, which is set equal to 1.7 when the logistic model is used as a substitute for the normal
ogive model. Lord's bias function $B(\theta)$ can be written as

(3.2)  $B(\theta) = D [I(\theta)]^{-2} \sum_{g=1}^{n} a_g I_g(\theta) [\Psi_g(\theta) - 1/2]$ ,

where

(3.3)  $\Psi_g(\theta) = [1 + \exp\{-D a_g (\theta - b_g)\}]^{-1}$ ,

and $I_g(\theta)$ and $I(\theta)$ are the item information function and the test information function, respectively,
which are given by

(3.4)  $I_g(\theta) = [P'_g(\theta)]^2 [P_g(\theta)]^{-1} [1 - P_g(\theta)]^{-1}$

and

(3.5)  $I(\theta) = \sum_{g=1}^{n} I_g(\theta)$ ,

with $P'_g(\theta)$ indicating the first derivative of $P_g(\theta)$ with respect to $\theta$. The former of these two
formulae can be given as a special case of the item information function (Samejima, 1969, 1972), which
is defined for the general case of discrete responses. (Incidentally, in Lord's paper, $B_1(\hat{\theta})$ is used
for this bias function. This is not appropriate, however, since it is a function of $\theta$ itself, not of its
maximum likelihood estimate $\hat{\theta}$.)
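As an illustration of how (3.1) through (3.5) fit together, the following Python sketch evaluates Lord's bias function for a list of items. The function names and the representation of an item as an (a, b, c) triple are assumptions made here for illustration, not part of the original programs.

```python
import math

D = 1.7  # scaling factor quoted in the text


def psi(theta, a, b):
    """Logistic function of (3.3)."""
    return 1.0 / (1.0 + math.exp(-D * a * (theta - b)))


def item_information(theta, a, b, c):
    """Item information (3.4) under the three-parameter logistic model (3.1)."""
    s = psi(theta, a, b)
    p = c + (1.0 - c) * s                     # (3.1)
    dp = (1.0 - c) * D * a * s * (1.0 - s)    # first derivative of (3.1)
    return dp * dp / (p * (1.0 - p))


def lord_bias(theta, items):
    """Bias function (3.2); items is a list of (a, b, c) triples."""
    infos = [item_information(theta, a, b, c) for a, b, c in items]
    total = sum(infos)                        # test information (3.5)
    weighted = sum(a * info * (psi(theta, a, b) - 0.5)
                   for (a, b, _), info in zip(items, infos))
    return D * weighted / total ** 2
```

With $c_g = 0$ and difficulties placed symmetrically around $\theta$, the positive and negative contributions cancel and the bias vanishes; raising $c_g$ lowers the item information and therefore enlarges the bias in absolute value.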

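[III.2]  Rationale

A similar logic can be adopted for the general case, in which item responses are simply discrete. We
assume that there are a finite or an enumerable number of discrete responses $k_g$'s as possible responses
to item $g$. Thus for the set of $n$ items, we can write for the response pattern $V$

(3.6)  $V' = (k_1, k_2, \ldots, k_g, \ldots, k_n)$ .

We assume that the operating characteristic $P_{k_g}(\theta)$ is three-times differentiable with respect to $\theta$.
By virtue of local independence, we can write for the likelihood function

(3.7)  $L_V(\theta) = P_V(\theta) = \prod_{k_g \in V} P_{k_g}(\theta)$ .

Thus the likelihood equation is given by

(3.8)  $\frac{\partial}{\partial\theta} \log L_V(\theta) = \sum_{k_g \in V} \frac{\partial}{\partial\theta} \log P_{k_g}(\theta) = 0$ .

We define $\Gamma_{s k_g}(\theta)$ such that

(3.9)  $\Gamma_{s k_g}(\theta) = \frac{\partial^s}{\partial\theta^s} \log P_{k_g}(\theta)$

for $s = 1, 2, \ldots$ We notice, in particular, that

(3.10)  $\Gamma_{1 k_g}(\theta) = P'_{k_g}(\theta) [P_{k_g}(\theta)]^{-1} = A_{k_g}(\theta)$ ,

where $A_{k_g}(\theta)$ is the basic function (Samejima, 1969). Let $\Gamma_{sV}(\theta)$ be defined by

(3.11)  $\Gamma_{sV}(\theta) = \sum_{k_g \in V} \Gamma_{s k_g}(\theta)$

for $s = 1, 2, \ldots$ For a fixed value of $\theta$ we can write by Taylor's formula

(3.12)  $\Gamma_{1V}(\hat{\theta}_V) = \Gamma_{1V}(\theta) + (\hat{\theta}_V - \theta) \Gamma_{2V}(\theta) + (1/2)(\hat{\theta}_V - \theta)^2 \Gamma_{3V}(\theta)$

        $+ \; (1/6)(\hat{\theta}_V - \theta)^3 \Gamma_{4V}(\theta) + (1/24)(\hat{\theta}_V - \theta)^4 \Gamma_{5V}(\xi) = 0$ ,

where $\xi$ is some value between $\theta$ and $\hat{\theta}_V$.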


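Since we have

(3.13)  $\sum_{k_g} P_{k_g}(\theta) = 1$ ,

we obtain

(3.14)  $\sum_{k_g} \frac{\partial^t}{\partial\theta^t} P_{k_g}(\theta) = 0$

for $t = 1, 2, \ldots$ Equation (3.14) will be helpful in following the mathematical derivations which are
needed in obtaining the bias function. Let $\gamma_{sg}(\theta)$ be the conditional expectation of $\Gamma_{s k_g}(\theta)$, given
$\theta$, which can be written as

(3.15)  $\gamma_{sg}(\theta) = E[\Gamma_{s k_g}(\theta) \mid \theta] = \sum_{k_g} \Gamma_{s k_g}(\theta) P_{k_g}(\theta)$ .

In particular, we have from (3.10) and (3.14)

(3.16)  $\gamma_{1g}(\theta) = \sum_{k_g} P'_{k_g}(\theta) = 0$ .

We further define $\gamma_s(\theta)$, $\epsilon_{s k_g}(\theta)$ and $\epsilon_{sV}(\theta)$ such that

(3.17)  $\gamma_s(\theta) = (1/n) \sum_{g=1}^{n} \gamma_{sg}(\theta)$

for $s = 1, 2, \ldots$ ,

(3.18)  $\epsilon_{s k_g}(\theta) = \Gamma_{s k_g}(\theta) - \gamma_{sg}(\theta)$

and

(3.19)  $\epsilon_{sV}(\theta) = (1/n) \sum_{k_g \in V} \epsilon_{s k_g}(\theta)$ ,

respectively. For the conditional expectation of $\epsilon_{sV}(\theta)$, given $\theta$, we obtain

(3.20)  $E[\epsilon_{sV}(\theta) \mid \theta] = \sum_{V} \epsilon_{sV}(\theta) P_V(\theta) = \gamma_s(\theta) - \gamma_s(\theta) = 0$ .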


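With these definitions of $\gamma_s(\theta)$ and $\epsilon_{sV}(\theta)$ and from (3.12) we have

(3.21)  $(1/n) \Gamma_{1V}(\hat{\theta}_V) = \epsilon_{1V}(\theta) + (\hat{\theta}_V - \theta)[\gamma_2(\theta) + \epsilon_{2V}(\theta)] + (1/2)(\hat{\theta}_V - \theta)^2 [\gamma_3(\theta) + \epsilon_{3V}(\theta)]$

        $+ \; (1/6)(\hat{\theta}_V - \theta)^3 [\gamma_4(\theta) + \epsilon_{4V}(\theta)] + (1/24)(\hat{\theta}_V - \theta)^4 (1/n) \Gamma_{5V}(\xi) = 0$ ,

and proceeding from here by taking the conditional expectation of each term with respect to $V$, given
$\theta$, and ignoring all terms whose orders are higher than $n^{-1}$, we obtain

(3.22)  $E[\epsilon_{1V}(\theta) \mid \theta] + \gamma_2(\theta) E[\hat{\theta}_V - \theta \mid \theta] + E[(\hat{\theta}_V - \theta) \epsilon_{2V}(\theta) \mid \theta]$

        $+ \; (1/2) \gamma_3(\theta) E[(\hat{\theta}_V - \theta)^2 \mid \theta] = 0$ .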


It is obvious from (3.20) that the first term on the left hand side of (3.22) disappears. As for the fourth
and last term in (3.22), we can use the asymptotic variance of the distribution of the maximum likelihood
estimate as the approximation to its last factor, i.e.,

(3.23)  $E[(\hat{\theta}_V - \theta)^2 \mid \theta] \doteq [I(\theta)]^{-1}$ .

Thus all that is left to do is to evaluate the third term on the left hand side of (3.22) in the general
framework. From this we obtain

(3.24)  $E[(\hat{\theta}_V - \theta) \epsilon_{2V}(\theta) \mid \theta] = (1/n)[I(\theta)]^{-1} \sum_{g=1}^{n} \sum_{k_g} A_{k_g}(\theta) [P''_{k_g}(\theta) - A_{k_g}(\theta) P'_{k_g}(\theta)]$ ,

where $P'_{k_g}(\theta)$ and $P''_{k_g}(\theta)$ indicate the first and second derivatives of $P_{k_g}(\theta)$ with respect to $\theta$,
respectively. Substituting (3.15), (3.17), (3.23) and (3.24) into (3.22) and rearranging, we obtain for
the bias function of the maximum likelihood estimate

(3.25)  $B(\theta) = E[\hat{\theta}_V - \theta \mid \theta] = -(1/2)[I(\theta)]^{-2} \sum_{g=1}^{n} \sum_{k_g} A_{k_g}(\theta) P''_{k_g}(\theta)$

        $= -(1/2)[I(\theta)]^{-2} \sum_{g=1}^{n} \sum_{k_g} P'_{k_g}(\theta) P''_{k_g}(\theta) [P_{k_g}(\theta)]^{-1}$ .


On the graded response level where item score $x_g$ assumes successive integers, 0 through $m_g$, each
$k_g$ in (3.25) must be replaced by $x_g$. On the dichotomous response level, it can be reduced to the
form

(3.26)  $B(\theta) = E[\hat{\theta}_V - \theta \mid \theta] = (-1/2)[I(\theta)]^{-2} \sum_{g=1}^{n} I_g(\theta) P''_g(\theta) [P'_g(\theta)]^{-1}$ ,

with $P''_g(\theta)$ indicating the second derivative of $P_g(\theta)$ with respect to $\theta$. This includes Lord's bias
function in the three-parameter logistic model as a special case. In the normal ogive model, the item
characteristic function is given by

(3.27)  $P_g(\theta) = (2\pi)^{-1/2} \int_{-\infty}^{a_g(\theta - b_g)} e^{-u^2/2} \, du$ ,

where $a_g$ and $b_g$ are the item discrimination and difficulty parameters, respectively. From (3.27), we
can write for the first and second derivatives of $P_g(\theta)$ with respect to $\theta$

(3.28)  $P'_g(\theta) = (2\pi)^{-1/2} a_g \exp\{-a_g^2 (\theta - b_g)^2 / 2\}$

and

(3.29)  $P''_g(\theta) = -a_g^2 (\theta - b_g) P'_g(\theta)$ ,

respectively. Substituting (3.28) and (3.29) into (3.26) and rearranging, we obtain for the bias function

(3.30)  $B(\theta) = (1/2)[I(\theta)]^{-2} \sum_{g=1}^{n} a_g^2 (\theta - b_g) I_g(\theta)$ .

In the (two-parameter) logistic model, the item characteristic function is given by

(3.31)  $P_g(\theta) = [1 + \exp\{-D a_g (\theta - b_g)\}]^{-1}$ ,

which is the same as $\Psi_g(\theta)$ in (3.3). The bias function is the same as (3.2), therefore, with
$I_g(\theta)$ and $I(\theta)$ obtained by setting $c_g = 0$.
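The reduction from (3.26) to (3.30) in the normal ogive model can be checked numerically. The sketch below is only an illustration: the function names and the representation of an item as an (a, b) pair are assumptions introduced here, and the two functions should agree up to rounding error for any list of items.

```python
import math


def normal_ogive(theta, a, b):
    """Return P_g, P'_g and P''_g of the normal ogive model, (3.27) to (3.29)."""
    z = a * (theta - b)
    p = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    dp = a * math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    d2p = -a * a * (theta - b) * dp
    return p, dp, d2p


def bias_dichotomous(theta, items):
    """Bias function in the dichotomous form (3.26); items holds (a, b) pairs."""
    total = 0.0
    weighted = 0.0
    for a, b in items:
        p, dp, d2p = normal_ogive(theta, a, b)
        info = dp * dp / (p * (1.0 - p))   # item information
        total += info
        weighted += info * d2p / dp
    return -0.5 * weighted / total ** 2


def bias_normal_ogive(theta, items):
    """Bias function in the closed form (3.30)."""
    total = 0.0
    weighted = 0.0
    for a, b in items:
        p, dp, _ = normal_ogive(theta, a, b)
        info = dp * dp / (p * (1.0 - p))
        total += info
        weighted += a * a * (theta - b) * info
    return 0.5 * weighted / total ** 2
```

Form (3.30) makes the sign of the bias easy to read: when $\theta$ lies above every $b_g$ all terms are positive, and when it lies below every $b_g$ all terms are negative.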


[III.3]  Bias Function and Amount of Test Information


We shall introduce some examples now. In developing nonparametric approaches and methods of
estimating the operating characteristics, or the conditional probabilities, given ability $\theta$, which are
assigned to separate discrete item responses, a set of simulated data has been used for testing these
approaches and methods, in which 35 graded test items following the normal ogive model with three
item score categories each are hypothesized as the Old Test (cf. Samejima, 1977, 1981). The square
root of the test information function of this Old Test is shown as the upper solid curve in Figure 3-1.
The bias function, which was computed through (3.25), is also shown in the same figure as the lower
solid curve. We can see in this figure that for the interval of $\theta$ covering $(-4, 4)$ the bias of the
maximum likelihood estimate is practically zero, i.e., the MLE of ability is practically unbiased for this
range of $\theta$. Thus, one of the necessary conditions to justify the use of the asymptotic normality as the
approximation for the conditional distribution of the MLE, given $\theta$, is satisfied.


We notice that for the range of $\theta$, $(-3, 3)$, the square root of the test information function of
this Old Test assumes approximately a constant value, 4.65, and we have already seen that for the
wider range of $\theta$ the bias function is practically zero. It is interesting to note that the bias
starts showing up both positively and negatively when the square root of test information drops below
a critical value, which is approximately 3.2, or the test information function drops below
approximately 10. In order to pursue this relationship, two more sets of these two functions are also
shown by dashed and dotted curves in Figure 3-1. These two sets were created by redichotomizing the
graded items of the Old Test, using the first and second sets of the difficulty parameters, respectively.
We can see that for a wide range of $\theta$ the square root of the test information is substantially less
than that of the original Old Test, which is the natural consequence of redichotomizing the items. We
notice that for each of these two, the square root of the test information function is barely greater than
3.2 for a wide range of $\theta$, and the bias is practically nil. Again the bias appears both positively and
negatively when the square root of the test information function drops below approximately 3.2.
If we tolerate biases of ±0.1, then the critical value of the square root of test information will
be approximately 2.75, or that of the test information function approximately 7.5. When the square
root of test information drops below 2.0, the bias becomes substantial.

Figure 3-2 presents similar results by dashed and dotted curves, respectively, which are based upon
two sets of empirical data. The first set consists of the results of the Level 11 Vocabulary Subtest of 43 items of
the Iowa Tests of Basic Skills, collected for 2,356 school children of approximately age eleven, and the
second set consists of those of Test J1 of Shiba's Word/Phrase Comprehension Tests of 55 items, collected
for 2,259 junior high school students (cf. Samejima, 1981). Both sets of operating characteristics were
estimated by assuming the normal ogive model. In these two cases, the critical value of the square root
of test information when we tolerate biases of ±0.1 turned out to be smaller, i.e., approximately 1.75,
or the critical value of the test information function is approximately 3.0. These differences seem to
have something to do with the fact that in the Old Test there are only 35 test items with the average
discrimination parameter as high as 1.70, while there are as many as 43 and 55 items in the Iowa
Subtest and Shiba's J1 Test, with average discrimination parameters of 0.601 and 0.538,
respectively.

We can see in this figure that, for the Iowa Level 11 Vocabulary Subtest, the bias is practically nil
for the range of $\theta$, $(-2, 2)$, where
approximately 95 percent of the subjects are located. This interval is even wider for Shiba's Test J1.
This fact demonstrates the excellence of the tests in this respect. These two tests were more thoroughly
analysed in the present research project, and the results will be presented in chapters VII and VIII,
respectively.
[III.4]  Increment in Bias Caused by Random Guessing

The two graphs in Figure 3-3 show the increment in bias caused by the guessing parameters for the
Iowa Level 11 Vocabulary Subtest and Shiba's J1 Test, respectively. In each graph, the solid curve
indicates the bias function based upon the logistic model, while the dashed and dotted curves are the
bias functions based upon the three-parameter logistic model, with the guessing parameters, 0.20
and 0.25, respectively, added to the same discrimination and difficulty parameters of each item. It
is obvious that a substantial increment in bias is caused by the addition of the guessing parameter,
especially on the lower levels of ability.




[III.5]  Adaptive Testing

Observations made in the previous sections give us some idea of how things go in adaptive testing.
First of all, in order to reach practical unbiasedness in estimating the individual subject's ability in
adaptive testing, we need to make sure that a sufficient amount of test information has been reached for
each individual subject before terminating the presentation of new items. We can control this easily if we
use the amount of test information as the criterion for terminating the presentation of new items, i.e., as the
"stopping rule". If the items follow the normal ogive or logistic model in the adaptive testing situation,
for subjects of intermediate ability levels it is likely that at the initial stage the item difficulty parameters

FIGURE 3-3

MLE Bias Function Based upon the Three-Parameter Logistic Model with the Guessing Parameters
0.20 (Dashed Curve) and 0.25 (Dotted Curve), Respectively, in Comparison with the One Based upon
the (Two-Parameter) Logistic Model, for the Iowa Level 11 Vocabulary Subtest (Upper Graph) and for
Shiba's Word/Phrase Comprehension Test J1 (Lower Graph).

fluctuate both negatively and positively around the subject's true ability level, and consequently the
biases in the negative and positive directions cancel out, since an item pool usually has plenty of
items of intermediate difficulty. In such a case we do not have to worry too much about the influence
of the initial items on the eventual bias of the ability estimate. Once the maximum likelihood estimate
has more or less stabilized, chances are slim that an additional item causes a substantial
bias, provided that the program is written in such a way that an item with a large amount of information
at the current estimated ability level will be presented next, and that the item pool has a sufficient
number of items whose difficulty levels are around the subject's true ability level. There is a greater
possibility that the examinee obtains a biased ability estimate if his ability level is close to either end
of the configuration of difficulty parameters, since biases caused by the initially presented items are not
likely to cancel themselves out, and, moreover, there may not be a sufficient number of items whose
difficulty levels are close to his ability level.
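The two ingredients mentioned above, a stopping rule based on the amount of test information and the presentation of the most informative remaining item at the current estimate, can be sketched as follows. This is a minimal illustration assuming the (two-parameter) logistic model; the function names, the (a, b) item representation and the threshold argument are hypothetical, and the re-estimation of ability after each response is left out.

```python
import math

D = 1.7  # scaling factor of the logistic model


def item_information(theta, a, b):
    """Item information of the (two-parameter) logistic model."""
    s = 1.0 / (1.0 + math.exp(-D * a * (theta - b)))
    return (D * a) ** 2 * s * (1.0 - s)


def total_information(theta, items):
    """Test information accumulated over the administered (a, b) items."""
    return sum(item_information(theta, a, b) for a, b in items)


def should_stop(theta_hat, administered, min_information):
    """Stopping rule: stop once the test information at the current
    estimate reaches a chosen (hypothetical) criterion value."""
    return total_information(theta_hat, administered) >= min_information


def next_item(theta_hat, pool, used):
    """Index of the unused pool item with the largest information at
    theta_hat, or None when the pool is exhausted."""
    candidates = [i for i in range(len(pool)) if i not in used]
    if not candidates:
        return None
    return max(candidates, key=lambda i: item_information(theta_hat, *pool[i]))
```

The criterion value passed as `min_information` would be chosen from the kind of relationship between bias and test information described in [III.3]; no particular value is implied here.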

If  the  item  pool  consists  of  items  following  the  three-parameter  normal  ogive  or  logistic  model,  the 
effect  of  random  guessing  on  the  amount  of  bias  can  be  substantial,  especially  on  the  lower  levels  of 
ability.  In  such  a  case,  it  is  imperative  to  include  many  easy  items  in  the  item  pool. 

In  any  case,  the  bias  function  can  be  a  good  indicator  in  evaluating  the  item  pool,  if  we  use  it 
wisely  and  effectively.  Those  results  that  were  described  in  previous  sections  will  give  us  information 
and  suggestions  as  to  how  to  improve  an  existing  item  pool. 

[III.6]  Discussion 

It  has  been  observed  that:  1)  the  amount  of  bias  of  the  maximum  likelihood  estimate  increases  with 
the  decrease  of  the  amount  of  test  information,  and  there  seems  to  be  a  relatively  simple  relationship 
between  the  two;  2)  on  the  other  hand,  it  seems  that  the  configuration  of  the  discrimination  and  difficulty 
parameters  within  a  test  and  the  number  of  items  affect  the  amount  of  bias;  and  3)  random  guessing 
increases the amount of bias, especially on the lower levels of ability. The bias function is useful
in developing theory and methodologies which use the normal approximation of the conditional
distribution of the MLE of ability, given $\theta$, as we have seen in the nonparametric estimation of the
operating characteristics of discrete item responses.

There  are  many  other  developments  and  observations  concerning  the  MLE  bias  function  that  are 
not  presented  here.  Among  others,  they  include  comparison  of  different  models,  the  effect  of  the 
discrimination  parameters  on  the  bias  function,  the  effect  of  the  number  of  items  on  the  bias  function, 
the  bias  function  after  the  scale  was  transformed  including  the  general  case  of  discrete  responses  and 
the  equivalent  item  case  on  the  dichotomous  response  level,  and  the  effect  of  the  scale  transformation 
to  generate  a  constant  test  information. 
IV  Constancy in Item Information and the Information Loss
Caused  by  Noise  on  the  Dichotomous  Response  Level 

Researchers tend to consider that dichotomous items with high discrimination parameters are "good"
items, and those with low discrimination parameters are "bad" ones. This is not necessarily true,
however. If, for example, in our item pool of binary test items most items are of similar levels of
difficulty, items with low discrimination parameters will be more informative and useful for testing
individual subjects on the levels of latent trait, or "ability", which substantially depart from these
levels in either the positive or the negative direction.

The principal investigator has pointed out (Samejima, 1979) that there is a constancy in the amount
of information given by items regardless of the values of their discrimination parameters, provided that
these items have the same type of item characteristic functions. This was discussed in the unidimensional
latent space. The constancy exists in the square root of the item information function integrated over the
entire range of ability $\theta$. Among others, it has been shown that, if the item characteristic function is
strictly increasing in $\theta$ with zero and unity as its lower and upper asymptotes, respectively, as is the case
with the normal ogive model, logistic model, linear model, etc., this total area under the square root of
the item information function equals $\pi$, regardless of the mathematical formulae representing particular
models. It has also been shown that, if the model has a lower asymptote greater than zero, as is the
case with the three-parameter normal ogive and logistic models, etc., the constancy still exists, but the
area decreases as the lower asymptote increases. It will be worthwhile to investigate this constancy of
item information across different models and item parameters in a more general framework, and also
to investigate the amount of information loss caused by noise, such as the guessing parameter in the
three-parameter normal ogive or logistic model, etc. One reason why such research is necessary
is the fact that the three-parameter logistic model has been applied by so many researchers without a
deep enough understanding of the model. Another reason is that we need to know more about different
types of models in order to use them for different purposes of research. One example of such a need
will be given in modeling differential strategies in cognitive processes, which will be outlined in chapter
V.

The principal investigator pursued the topics described above, and a summary will
be presented in this chapter. For further detail and more information, see [I.1.1] and [I.1.2].

[IV.1]  Four Types of Models for Dichotomous Test Items

We assume that the item characteristic function, $P_g(\theta)$, is strictly increasing in ability $\theta$, for the
interval

(4.1)  $\underline{\theta} < \theta < \bar{\theta}$ ,

where $\underline{\theta}$ and $\bar{\theta}$ may be negative and positive infinities, respectively, or finite numbers. This interval,
$(\underline{\theta}, \bar{\theta})$, can either be the whole range of ability $\theta$ which is common for all items, or a subinterval
specified for a particular item. Let $c_{g1}$ and $c_{g2}$ denote the lower and upper asymptotes of the item
characteristic function, where

(4.2)  $0 \le c_{g1} < c_{g2} \le 1$ .

Four types of models, Types A, B, C and D, are considered in this research, which are represented
by the general formula for the item characteristic function $P_g(\theta)$ such that

(4.3)  $P_g(\theta) = c_{g1} + (c_{g2} - c_{g1}) \Psi_g(\theta)$ ,

where $\Psi_g(\theta)$ is a strictly increasing function of $\theta$ for the interval $(\underline{\theta}, \bar{\theta})$, with zero and unity as its
lower and upper asymptotes, respectively. These four types of models are distinguished from each other
by the values of $c_{g1}$ and $c_{g2}$, which are listed below.

Type A:  $0 = c_{g1} < c_{g2} = 1$
Type B:  $0 < c_{g1} < c_{g2} = 1$
Type C:  $0 = c_{g1} < c_{g2} < 1$
Type D:  $0 < c_{g1} < c_{g2} < 1$

Figure 4-1 presents a set of examples of these four types where $\Psi_g(\theta)$ is the item characteristic function
of the normal ogive model, which is given by

(4.4)  $\Psi_g(\theta) = (2\pi)^{-1/2} \int_{-\infty}^{a_g(\theta - b_g)} e^{-u^2/2} \, du$ ,

with the parameters $a_g = 1.0$ and $b_g = 0.0$. For Types B and D, we have $c_{g1} = 0.2$, and, for Types
C and D, $c_{g2} = 0.8$. We notice that the example for Type A which is given in Figure 4-1 is the normal
ogive model itself, and that for Type B is the three-parameter normal ogive model. We have no specific
models of Types C and D which are commonly used yet. As will be pointed out in Chapter V, however,
models of Type C, in particular, have important roles in dealing with items of multi-correct answers
and in modeling differential strategies in cognitive processes. Investigating the characteristics of this
type of model will be just as important, therefore, for further advancement of latent trait theory.

[IV.2]  Information Loss

Let $Q$ denote the total information, which is defined by

(4.5)  $Q = \int_{-\infty}^{\infty} [I_g(\theta)]^{1/2} \, d\theta$ ,

where $I_g(\theta)$ is the item information function for which we can write

(4.6)  $I_g(\theta) = [P'_g(\theta)]^2 [P_g(\theta)]^{-1} [1 - P_g(\theta)]^{-1}$ .

Following a sequence of logical and mathematical steps, we obtain for the total information

(4.7)  $Q = 2 [\tan^{-1}\{c_{g2} / (1 - c_{g2})\}^{1/2} - \tan^{-1}\{c_{g1} / (1 - c_{g1})\}^{1/2}]$ .



Examples  of  Item  Characteristic  Functions  of  Types  A,  B,  C  and  D. 


It is obvious from (4.7) that, when item $g$ belongs to Type A, as is the case with the normal ogive
model, logistic model, linear model, constant information model, etc., the second term in the second
factor of the right hand side of (4.7) disappears, and also the first term takes on the maximal value, $\pi/2$.
Thus we have

(4.8)  $Q = \pi$ ,

the result which is consistent with our previous finding (Samejima, 1979). We can also see from (4.7)
that, as $c_{g1}$ departs from zero, and $c_{g2}$ from unity, the total information $Q$ becomes progressively
smaller than $\pi$. In the three examples shown in Figure 4-1, we obtain $Q = 2.214$ for both Types B
and C, and $Q = 1.287$ for Type D. Also we can easily see from (4.7) that the total information $Q$
assumes the same value for Types B and C whenever $c_{g1} = 1 - c_{g2}$.
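Formula (4.7) is easy to evaluate directly. The following Python sketch, with a hypothetical function name, uses `math.atan2` so that the limiting case $c_{g2} = 1$, where the ratio $c_{g2}/(1 - c_{g2})$ is infinite, needs no special handling.

```python
import math


def total_information(c1, c2):
    """Total information Q of (4.7) for lower and upper asymptotes c1 < c2."""
    if not 0.0 <= c1 < c2 <= 1.0:
        raise ValueError("asymptotes must satisfy 0 <= c1 < c2 <= 1")
    # atan2(sqrt(c), sqrt(1 - c)) equals arctan of {c / (1 - c)}^(1/2)
    # and stays well defined at c = 1, where it gives pi / 2.
    upper = math.atan2(math.sqrt(c2), math.sqrt(1.0 - c2))
    lower = math.atan2(math.sqrt(c1), math.sqrt(1.0 - c1))
    return 2.0 * (upper - lower)
```

With the asymptotes used for Figure 4-1, `total_information(0.0, 1.0)` gives $\pi$ for Type A, `total_information(0.2, 1.0)` and `total_information(0.0, 0.8)` both give about 2.214 for Types B and C, and `total_information(0.2, 0.8)` gives about 1.287 for Type D.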


[IV.3]  Basic Functions and Item Response Information Functions

There are certain models for binary items which assure the existence of a unique maximum for the
likelihood function of every possible response pattern, such as the normal ogive model, logistic model, etc.
In fact, except for the two extreme response patterns in which the binary item score $u_g$ assumes zero
for all items, and unity for all items, respectively, the likelihood function has a unique local maximum
in those models. It has been pointed out (Samejima, 1969, 1972) that a sufficient condition for the
unique maximum is: 1) that the basic function, which is given by

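(4.9)  $A_{u_g}(\theta) = (-1)^{u_g + 1} P'_g(\theta) \{[P_g(\theta)]^{u_g} [1 - P_g(\theta)]^{1 - u_g}\}^{-1} = \frac{\partial}{\partial\theta} \log \{[P_g(\theta)]^{u_g} [1 - P_g(\theta)]^{1 - u_g}\}$ ,   $u_g = 0, 1$ ,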


is strictly decreasing in $\theta$ throughout its whole range, and 2) that its upper asymptote is non-negative
and its lower asymptote is non-positive. For brevity, we shall call it the unique maximum condition.
This condition implies that the item response information function is positive except, at most, at an
enumerable number of points of $\theta$. It has been shown (Samejima, 1972, 1973b) that the three-parameter
logistic model does not satisfy the unique maximum condition, and that the likelihood
function for certain response patterns has more than one modal point. The same is true with the
three-parameter normal ogive model.


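There also are models of Type A which do not satisfy the unique maximum condition, as is
exemplified by the linear model. For simplicity, let $\Psi'_g(\theta)$ denote the first partial derivative of $\Psi_g(\theta)$
with respect to $\theta$, and $\Psi''_g(\theta)$ be the second partial derivative. We can write for the basic function

(4.10)  $A_{u_g}(\theta) = -(c_{g2} - c_{g1}) \Psi'_g(\theta) [(1 - c_{g1}) - (c_{g2} - c_{g1}) \Psi_g(\theta)]^{-1}$     $u_g = 0$

                $= (c_{g2} - c_{g1}) \Psi'_g(\theta) [c_{g1} + (c_{g2} - c_{g1}) \Psi_g(\theta)]^{-1}$     $u_g = 1$

for the general form of the item characteristic function which is specified by (4.3). The item response
information function $I_{u_g}(\theta)$ is defined by

(4.11)  $I_{u_g}(\theta) = -\frac{\partial}{\partial\theta} A_{u_g}(\theta)$ ,

and from (4.3) and (4.10) we can write

(4.12)  $I_{u_g}(\theta) = \{(c_{g2} - c_{g1})^2 [\{\Psi'_g(\theta)\}^2 + \{1 - \Psi_g(\theta)\} \Psi''_g(\theta)] + (1 - c_{g2})(c_{g2} - c_{g1}) \Psi''_g(\theta)\}$
                $\times [(1 - c_{g1}) - (c_{g2} - c_{g1}) \Psi_g(\theta)]^{-2}$     $u_g = 0$

                $= \{(c_{g2} - c_{g1})^2 [\{\Psi'_g(\theta)\}^2 - \Psi_g(\theta) \Psi''_g(\theta)] - c_{g1} (c_{g2} - c_{g1}) \Psi''_g(\theta)\}$
                $\times [c_{g1} + (c_{g2} - c_{g1}) \Psi_g(\theta)]^{-2}$     $u_g = 1$ .

These formulae can be simplified for Types A, B and C by substituting zero for $c_{g1}$ and/or unity
for $c_{g2}$.

If we specify $\Psi_g(\theta)$ by the logistic function such that

(4.13)  $\Psi_g(\theta) = [1 + \exp\{-D a_g (\theta - b_g)\}]^{-1}$ ,

then we have

(4.14)  $\Psi'_g(\theta) = D a_g \Psi_g(\theta) [1 - \Psi_g(\theta)]$ ,

(4.15)  $\Psi''_g(\theta) = D^2 a_g^2 \Psi_g(\theta) [1 - \Psi_g(\theta)] [1 - 2 \Psi_g(\theta)]$ .
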

Figure  4-2  presents  four  examples  of  the  item  response  information  functions  of  Types  A,  B.,  C  and 
D,  with  specified  item  parameters. 


As we can see in these examples, in certain cases the item response information function assumes
negative values for a specific interval of $\theta$. Let $\underline{\theta}_g$ denote the critical value of $\theta$ below which the
item response information function of Type B or D assumes negative values for $u_g = 1$, and $\overline{\theta}_g$ be
the one above which the item response information function of Type C or D takes on negative values
for $u_g = 0$. In general, we can write

  $\underline{\theta}_g = -(2Da_g)^{-1}[\log c_{g2} - \log c_{g1}] + b_g$

  $\overline{\theta}_g = (2Da_g)^{-1}[\log(1 - c_{g1}) - \log(1 - c_{g2})] + b_g$ .


These  critical  values  are  shown  in  Figure  4-2. 
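As an illustration of the two critical values, the sketch below evaluates them for the logistic case and checks the sign of the item response information function on either side. It is a hedged example only: the function names are hypothetical, and the information function is approximated by a finite difference of the log probability rather than taken from the report's programs.

```python
import math


def critical_values(c1, c2, a=1.0, b=0.0, D=1.7):
    """Return (lower, upper) critical values; None where no such value exists."""
    lower = None
    upper = None
    if c1 > 0.0:   # Types B and D: negative information below `lower` for u_g = 1
        lower = b - (math.log(c2) - math.log(c1)) / (2.0 * D * a)
    if c2 < 1.0:   # Types C and D: negative information above `upper` for u_g = 0
        upper = b + (math.log(1.0 - c1) - math.log(1.0 - c2)) / (2.0 * D * a)
    return lower, upper


def response_information(theta, u, c1, c2, a=1.0, b=0.0, D=1.7, h=1e-4):
    """Numerical -d^2/dtheta^2 log P_u(theta) for the logistic case."""
    def log_p(t):
        psi = 1.0 / (1.0 + math.exp(-D * a * (t - b)))
        p1 = c1 + (c2 - c1) * psi
        return math.log(p1 if u == 1 else 1.0 - p1)
    return -(log_p(theta + h) - 2.0 * log_p(theta) + log_p(theta - h)) / h ** 2


# Hypothetical Type D item with c1 = 0.2 and c2 = 0.9.
lower, upper = critical_values(0.2, 0.9)
```

With these hypothetical parameters the information for $u_g = 1$ changes sign at `lower` and the information for $u_g = 0$ changes sign at `upper`; a Type A item has neither critical value.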
FIGURE 4-2

Two Item Response Information Functions (Solid Lines) and Item Information Function (Dashed Line)
of Each of Type A, B, C and D Models: Logistic Function Is Used for $\Psi_g(\theta)$ with $D = 1.7$, $a_g = 1.00$
and $b_g = 0.00$. These Three Curves Overlap for Type A. Values of $c_{g1}$ and/or $c_{g2}$ Are Specified for
Types B, C and D. (Horizontal axis: latent trait $\theta$.)

Since the item information function is the conditional expectation of the item response information
function, i.e.,

(4.18)  $I_g(\theta) = E[I_{u_g}(\theta) \mid \theta]$ ,

the values of $\underline{\theta}_g$ and $\overline{\theta}_g$ indicate the direction of the information loss. Figure 4-3 exemplifies the
information loss for various values of $c_{g1}$ and/or of $c_{g2}$ for Types B, C and D.


[IV.4]  Three-Parameter  Logistic  Model 

Since the three-parameter logistic model is one of the most widely used models for dichotomous test
items, special observations have been made for this model. This model belongs to Type B, and its item
characteristic function is given by

(4.19)  $P_g(\theta) = c_{g1} + (1 - c_{g1})[1 + \exp\{-Da_g(\theta - b_g)\}]^{-1}$ .

We also have for the total information $Q$ in this model

(4.20)  $Q = \pi - 2\tan^{-1}\left[\frac{c_{g1}}{1 - c_{g1}}\right]^{1/2}$ .

When $c_{g1} = 0.20$, $Q$ equals, approximately, $0.705\pi$, and when $c_{g1} = 0.25$, it is approximately $0.667\pi$.
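The two quoted values can be reproduced directly from (4.20). The sketch below also compares (4.20) with a numerical integral, under the assumption (not restated in this section) that $Q$ is the area under the square root of the item information function; the function names and the integration grid are hypothetical choices made for the illustration.

```python
import math

D = 1.7


def total_information(c):
    """Closed form (4.20) for lower asymptote c."""
    return math.pi - 2.0 * math.atan(math.sqrt(c / (1.0 - c)))


def sqrt_item_information(theta, c, a=1.0, b=0.0):
    """Square root of the item information of the three-parameter logistic model."""
    psi = 1.0 / (1.0 + math.exp(-D * a * (theta - b)))
    p = c + (1.0 - c) * psi
    return D * a * psi * math.sqrt((1.0 - c) * (1.0 - psi) / p)


def area_under_sqrt_information(c, lo=-20.0, hi=20.0, steps=8000):
    """Trapezoidal approximation of the integral over theta (assumed meaning of Q)."""
    h = (hi - lo) / steps
    total = 0.5 * (sqrt_item_information(lo, c) + sqrt_item_information(hi, c))
    for k in range(1, steps):
        total += sqrt_item_information(lo + k * h, c)
    return total * h


for c in (0.20, 0.25):
    print(c, round(total_information(c) / math.pi, 3))
```

Under that assumption the area does not depend on $a_g$ or $b_g$, and it reduces to $\pi$ when $c_{g1} = 0$.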


Since four and five are the most commonly used numbers of alternative answers to a multiple-choice
test item, the square root of the test information function $I(\theta)$, which is given by

(4.21)  $[I(\theta)]^{1/2} = \left[\sum_{g=1}^{n} I_g(\theta)\right]^{1/2}$ ,

was observed for each of the two cases where $c_{g1} = 0.20$ and $c_{g1} = 0.25$, respectively, for eleven sets
of different numbers of equivalent items, in comparison with the case where $c_{g1} = 0$, i.e., the
(two-parameter) logistic model. These results were also compared with the corresponding results obtained
by assuming the three-parameter normal ogive model. The standard error of measurement as a function of
ability $\theta$ was also observed for these different sets of equivalent items.
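A comparison of this kind can be sketched as follows. This is an illustration under stated assumptions, not a reproduction of the reported results: the eleven set sizes are not given in this section, so the number of items used below is hypothetical, and the standard error is taken to be the reciprocal of the square root of the test information, the usual asymptotic value.

```python
import math

D = 1.7


def item_information(theta, c=0.0, a=1.0, b=0.0):
    """Item information of the three-parameter logistic model (c = 0: logistic model)."""
    psi = 1.0 / (1.0 + math.exp(-D * a * (theta - b)))
    p = c + (1.0 - c) * psi
    return (D * a) ** 2 * (1.0 - c) * (1.0 - psi) * psi ** 2 / p


def sqrt_test_information(theta, n_items, c=0.0):
    """Equation (4.21) for n_items equivalent items sharing the same parameters."""
    return math.sqrt(n_items * item_information(theta, c))


def standard_error(theta, n_items, c=0.0):
    """Asymptotic standard error, assumed to equal 1 / sqrt(test information)."""
    return 1.0 / sqrt_test_information(theta, n_items, c)


# Hypothetical set of 10 equivalent items, evaluated at theta = 0.
for c in (0.0, 0.20, 0.25):
    print(c, sqrt_test_information(0.0, 10, c), standard_error(0.0, 10, c))
```

For equivalent items the square root of the test information grows with the square root of the number of items, so quadrupling the test length halves the standard error at every $\theta$.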

[IV.5] Loss in Speed of Convergence of the Conditional Distribution of
the Maximum Likelihood Estimate to the Normality

The effect of noise caused by the additional parameters, $c_{g1}$ and $c_{g2}$, is naturally found in the loss
of the speed of convergence of the conditional distribution of the maximum likelihood estimate $\hat{\theta}$,
given $\theta$, to the normality. This was observed both for different sets of equivalent items and for
those of non-equivalent items. It was discovered that the effect of noise is substantial in decreasing the
convergence speed.
FIGURE 4-3

Square Root of the Item Information Function for Each of Several Hypothetical Items of Types B, C
and D (Solid Lines) in Contrast to the One Following the Logistic Model, with $D = 1.7$, $a_g = 1.00$
and $b_g = 0.00$ (Dotted Line). Values of $c_{g1}$ and/or $c_{g2}$ Are as Specified.
[IV.6] Discussion

The principal investigator’s standpoint is that we should try to eliminate noise by constructing
“good” test items, since noise, which may be caused by random guessing or by some other factors, is
nothing but a nuisance. Its undesirable effect is probably greater than most researchers think. Because of
general indifference and uncritical acceptance of the three-parameter logistic model, however, it seems
necessary that someone should quantify the effect of noise incorporated in such models. The effective
use of the critical values $\underline{\theta}_g$ and $\overline{\theta}_g$ may be a step in the right direction.

There  are  many  other  developments  and  observations  which  are  not  presented  here.  They  include, 
among  others,  observations  of  the  loss  of  accuracy  in  ability  estimation  caused  by  random  guessing  in 
the  three-parameter  logistic  model,  both  for  equivalent  items  and  non-equivalent  items. 
V  A Latent Trait Model for Differential Strategies in
Cognitive Processes

One of the main objectives of this research project is to “bridge” psychometrics with cognitive
psychology through the advancement of latent trait theory. With the rapid progress of microcomputers
in the past decade and the accompanying decreases in their cost, many scientific investigations which
were considered practically impossible in the past are now within our reach. Thus in many areas of
cognitive psychology, where researchers used to conduct their research using relatively small samples of
subjects, we can plan our research on a much larger scale. The time is coming, therefore, when latent trait
theory will find its way to contribute to the progress of cognitive psychology.

Some cognitive psychologists who have tried to approach psychometric theories say that these do
not provide them with theories and methods with which they can deal with differential strategies. They
are not exactly right, however. As early as the late nineteen-sixties, the heterogeneous case of the
graded response level in the context of latent trait theory was proposed (Samejima, 1967) as a model for
cognitive processes. Some useful hints for differential strategies are also seen (Samejima, 1972, Section
3.4) under the title “Multi-correct and multi-incorrect responses.”

Following the same line, the principal investigator proposed a general latent trait model for differential
strategies in cognitive processes, and discussed the topics intrinsic in the model (cf. [1.1.3], [1.1.4]).
In this chapter, the outline of these works will be presented.

[V.1] Rationale

The model deals with the unidimensional latent space, in which the latent trait, or “ability”, $\theta$
assumes any real number. Thus we can write

(5.1)  $-\infty < \theta < \infty$ .

Let us take problem solving as an example. Suppose that for solving the problem $g$ we need $m_g$
sequential subprocesses. Let $y_g$ denote the attainment category or attainment score. One must
successfully follow all the $m_g$ sequential subprocesses in order to solve the problem $g$, so the
attainment category $y_g$ assumes integers, 0 through $(m_g + 1)$, with $y_g = 0$ indicating that the
individual subject has successfully followed none of the subprocesses, and with $y_g = m_g$ meaning
that he has completed all $m_g$ subprocesses required to solve the problem. The additional attainment
score, $(m_g + 1)$, indicates that the subject has successfully followed the additional subprocess which
does not exist but is hypothesized at the end of the entire sequence of subprocesses. Since no one can
accomplish this, the conditional probability, given $\theta$, with which the subject obtains the attainment
score $(m_g + 1)$ equals zero, regardless of the value of $\theta$. With this setting, we can see that the
general graded response model can readily be applied to the single strategy case of problem solving. Our
main objective is, however, to approach a general model for the multiple strategy case, or differential
strategies, in the context of latent trait theory.

It is a fairly common phenomenon that there exists more than one way of solving a problem. In
proving a mathematical theorem, for example, we often find one proof plus several alternative proofs
for one theorem. Figure 5-1 presents a simple example of a two-strategy case in the form of a graph,
each strategy having a small number of subprocesses. In this example, if we take the first strategy to
solve the problem, then we must traverse the path $v_0 v_1 v_2 v_3 v_4$, whereas we must follow another path
$u_0 u_1 u_2 u_3 u_4 u_5$ if we take the second strategy. (Note that $v_0 = u_0$, $v_1 = u_1$, $v_3 = u_4$ and $v_4 = u_5$.)


When  the  subject  falters,  we  need  additional  arcs  in  the  digraph  presented  as  Figure  5-1.  Two 
examples  of  the  directed  subgraphs  which  represent  "faltering”  are  presented  in  Figure  5-2.  These  are 
rather  simple  examples  adding  one  cycle  to  each  path  included  in  Figure  5-1,  making  the  strategy  a  trail 
instead  of  a  path.  We  can  conceive  of  more  complex  examples,  however,  in  which  the  subject  traverses 
several  cycles  repeatedly  in  a  single  walk,  for  example. 

In  our  cognitive  process,  however,  we  often  choose  wrong  strategies  which  do  not  lead  to  the  solution 
of  the  problem  at  all.  Figure  5-3  illustrates  such  situations  in  which  hollow  circles  and  dashed  arcs  are 
added  to  imply  additional  paths  representing  wrong  strategies,  and  two  examples  of  such  unsuccessful 
strategies.  Even  if  the  subject  took  a  wrong  strategy,  he  may  become  aware  of  his  mistake  and  come 
back  to  a  previous  point  in  the  path  and  try  another  strategy.  Two  examples  of  such  trails  are  given 
in  Figure  5-4.  There  are  a  great  many  other  varieties  of  paths,  trails,  and  walks,  each  of  which  might 
represent  a  specified  subject’s  cognitive  process.  The  subject  may  walk  the  same  cycles  over  and  over 
again, for example, or he may stop at, say, the vertex $v_2$ in the path representing the first successful
strategy  in  Figure  5-4  and  then  may  not  proceed,  and  so  forth. 

It is obvious that following those cycles illustrated in Figures 5-2 and 5-4 will not directly improve
the subject’s degree of attainment toward the solution of the problem. Thus we can more or less ignore
the subject’s traversing of cycles, and the things that count are the paths in those graphs, rather than
trails or walks which may include one or more cycles. This implies, for instance, that the first trails
in Figures 5-2 and 5-4 are treated as equivalent to the completion of Strategy 2, the second trails in
those two figures are equivalent to that of Strategy 1, and the two examples of unsuccessful strategies
in Figure 5-3 are equivalent to the paths $u_0 u_1 u_2$ and $u_0 u_1 u_2 u_3$, respectively.
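The reduction of a trail or walk to the path that counts can be sketched as a simple cycle-erasing pass. This is only an illustration of the idea, not a procedure given in the report: the function name is hypothetical, and because the figures are not reproduced here the example walks and the vertex labels `x1`, `x2` are invented for the purpose.

```python
def erase_cycles(walk):
    """Reduce a walk (list of vertex labels) to a path by dropping every cycle.

    Whenever the walk returns to a vertex already on the current path,
    everything visited since that vertex is discarded.
    """
    path = []
    position = {}  # vertex label -> index in `path`
    for vertex in walk:
        if vertex in position:
            keep = position[vertex] + 1
            for dropped in path[keep:]:
                del position[dropped]
            del path[keep:]
        else:
            position[vertex] = len(path)
            path.append(vertex)
    return path


# A subject falters once on the second strategy (repeats u2 -> u3).
faltering = ["u0", "u1", "u2", "u3", "u2", "u3", "u4", "u5"]
# A subject tries a wrong branch (x1, x2), returns to v1, then completes Strategy 1.
detour = ["v0", "v1", "x1", "x2", "v1", "v2", "v3", "v4"]

print(erase_cycles(faltering))
print(erase_cycles(detour))
```

Both walks reduce to the plain path of the strategy that was eventually completed, which is the sense in which such trails are treated as equivalent to the completion of that strategy.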

There is no reason to assume that for a given individual of trait $\theta$ the probability of success stays
the same when he chooses different successful strategies. Thus the probability with which the subject
of trait $\theta$ solves the problem using the first successful strategy may not be the same as the one when
he uses the second successful strategy, even though the edge $(v_3\,v_4)$ is the same as the edge $(u_4\,u_5)$.


[V.2]  Differential  Strategy  Trees 

Figure 5-5 presents both the successful and unsuccessful strategies discussed in our example in the
form of a tree. Note that not only are the two points $v_4$ and $u_5$ in Figure 5-5 the same point in
Figure 5-3, but also the two hollow circles marked with * in Figure 5-5 are a single point in Figure 5-3,
and so are the two marked with **. We shall call this kind of tree the differential strategy tree. It is a
kind of directed graph which contains several paths representing different strategies, joining a common
initial endpoint with the distinct other endpoints. We call this initial point a nothing point, indicating
that “nothing has been accomplished yet,” and the other endpoints for the successful strategies solution
points, meaning that the “solution has been reached.” Since no one can surpass a solution point, it also
represents a hypothesized attainment score which no one can obtain.

Figure  5-6  presents  a  little  more  complicated  example  of  the  digraph  and  corresponding  differential 
strategy  tree,  in  which  only  successful  strategies  are  drawn.  Thus  in  our  second  example,  we  have  five 
successful  strategies  and  five  solution  points. 

[V.3]  A  General  Model  for  Differential  Strategies 

Let $w$ denote the number of successful strategies for solving the problem $g$. This number equals
the number of solution points in the differential strategy tree, which was illustrated in the preceding
section. Each of those $w$ strategies consists of $m_{gi}$ $(i = 1, 2, \ldots, w)$ subprocesses, and they are
represented by the vertices, excluding the first and last, both in the digraphs and in the differential
strategy trees. In the example which was first presented and discussed in the preceding section, $w = 2$,
$m_{g1} = 3$ and $m_{g2} = 4$, and the subprocesses are represented by four edges, $(v_0\,v_1)$, $(v_1\,v_2)$, $(v_2\,v_3)$
and $(v_3\,v_4)$, in the path $v_0 v_1 v_2 v_3 v_4$ representing the first successful strategy, for example. Let
$y_{gi}$ be the attainment score indicating the degree of attainment of the subject’s performance toward
the solution of the problem $g$, which takes on integers 0 through $m_{gi}$ when the subject chooses the
strategy $i$. Figure 5-7 presents the attainment scores assigned to separate edges of the differential
strategy tree of our second example.


A general model for differential strategies concerns the assignment of an operating characteristic to
each attainment score $y_{gi}$ of each of the $w$ strategies $i$ for solving the problem $g$. By such an
operating characteristic we mean the conditional probability with which the subject of trait $\theta$ chooses
the strategy $i$ and obtains the attainment score $y_{gi}$. We notice, however, that in general, if the
subject’s performance stopped before branching, there is no way to decide which of the two or more
strategies he would have taken. For example, $(s_1\,s_2)$ and $(t_1\,t_2)$ in Figure 5-6 are a single edge, and
so are $(v_1\,v_2)$ and $(w_1\,w_2)$. Thus we must assign a single operating characteristic for each edge of
the differential strategy tree. Since each edge represents a union of one or more attainment scores, the
operating characteristic is to be assigned to each union. For instance, following an appropriate model, a
single operating characteristic will be assigned to the union of $y_{gi} = 0$ for $i = 1, 2, 3, 4, 5$, and the same
model will provide us with an operating characteristic solely for $y_{g4} = 3$. For convenience, we shall
choose the smallest $i$ in each union, and let $y^*_{gsi}$ denote such a union with $s$ for the actual attainment
score. In Example 2, for instance, $y^*_{g01} = (y_{g1} = 0) \cup (y_{g2} = 0) \cup (y_{g3} = 0) \cup (y_{g4} = 0) \cup (y_{g5} = 0)$, and
$y^*_{g34} = (y_{g4} = 3)$, and none of the unions $y^*_{g02}$, $y^*_{g03}$, $y^*_{g04}$ and $y^*_{g05}$ exists.
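The labelling rule for the unions can be made concrete with a short sketch. It assumes that each successful strategy is supplied as the sequence of vertices on its path in the differential strategy tree, so that two strategies share an edge exactly when they share the whole prefix of vertices up to that edge; the function name and the vertex labels are hypothetical, and the two paths below are the ones described for the first example.

```python
from collections import defaultdict


def attainment_unions(strategies):
    """Map (s, i) -> strategies whose score s falls in the union y*_{gsi}.

    `strategies` is a list of vertex sequences, strategy 1 first.  The edge
    carrying attainment score s of a strategy is identified by the prefix of
    its first s + 2 vertices, and each union is labelled by the smallest
    strategy index it contains.
    """
    groups = defaultdict(list)
    for i, path in enumerate(strategies, start=1):
        m = len(path) - 2  # number of subprocesses m_gi
        for s in range(m + 1):
            groups[tuple(path[: s + 2])].append(i)
    unions = {}
    for prefix, members in groups.items():
        s = len(prefix) - 2
        unions[(s, min(members))] = sorted(members)
    return unions


# First example: two strategies sharing only the first edge (v0 v1) = (u0 u1).
example_1 = [
    ["v0", "v1", "v2", "v3", "v4"],
    ["v0", "v1", "u2", "u3", "v3", "v4"],
]
for label, members in sorted(attainment_unions(example_1).items()):
    print(label, members)
```

Here the only union containing more than one attainment score is the one for $s = 0$, labelled with $i = 1$, and no union labelled $(0, 2)$ is produced, in line with the convention of choosing the smallest $i$.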


Let $M_{y^*_{gsi}}(\theta)$ denote the conditional probability with which the subject of trait $\theta$ obtains $s$ as
his attainment score in one of the strategies which belongs to $y^*_{gsi}$, with the joint condition that he
has already obtained the score $s - 1$. Since there is no preceding attainment score for $y_{gi} = 0$, and
$y^*_{g01}$ is the union of $y_{gi} = 0$ for all the $w$ strategies, the attainment function $M_{y^*_{g01}}(\theta)$ takes on
unity throughout the whole range of $\theta$. On the other hand, since $y_{gi} = m_{gi} + 1$ is a hypothesized
attainment score which is higher than the full score $m_{gi}$, the attainment function $M_{y^*_{g(m_{gi}+1)i}}(\theta)$
assumes zero for the entire range of $\theta$ for each of the $w$ strategies. Thus we can write

(5.2)  $M_{y^*_{gsi}}(\theta) = \begin{cases} 1 & s = 0 \\ 0 & s = m_{gi} + 1, \quad i = 1, 2, \ldots, w . \end{cases}$

Note that in (5.2) the first line indicates a single function for the union of $y_{gi} = 0$ for $i = 1, 2, \ldots, w$,
while the second line indicates $w$ separate functions for $i = 1, 2, \ldots, w$.

Hereafter, we shall assume that each attainment function $M_{y^*_{gsi}}(\theta)$ is three-times differentiable
with respect to $\theta$. Note that this assumption does not contradict (5.2).


Let $P^*_{y^*_{gsi}}(\theta)$ be the conditional probability assigned to the union of attainment scores $y^*_{gsi}$, with
which the subject of trait $\theta$ chooses a strategy which belongs to $y^*_{gsi}$ and obtains the attainment score
$s$ or greater. We shall call this function the cumulative operating characteristic of the attainment score
union $y^*_{gsi}$. From the definitions of this function and the attainment function $M_{y^*_{gsi}}(\theta)$, we can write

(5.3)  $P^*_{y^*_{gsi}}(\theta) = \prod_{u \le s} M_{y^*_{gui'}}(\theta)$ ,

where $i'$ indicates the closest integer less than or equal to $i$ for which the union of attainment scores
exists. In particular, we have


(5.4)  $P^*_{y^*_{gsi}}(\theta) = \begin{cases} 1 & s = 0 \\ M_{y^*_{g1i}}(\theta) & s = 1 \\ 0 & s = m_{gi} + 1 . \end{cases}$

Note that the first line of (5.4) indicates a single function for the union of $y_{gi} = 0$ for $i = 1, 2, \ldots, w$, the
second line indicates one or more functions depending upon the branching, and the third line indicates
the $w$ separate functions for $i = 1, 2, \ldots, w$. In Example 1, the second line includes two functions,
while in Example 2 it includes three functions.

Let $A^*_{y^*_{gsi}}(\theta)$ be the first derivative of the natural logarithm of the cumulative operating
characteristic $P^*_{y^*_{gsi}}(\theta)$, that is,

(5.5)  $A^*_{y^*_{gsi}}(\theta) = \frac{\partial}{\partial\theta} \log P^*_{y^*_{gsi}}(\theta)$ .

Note that this function is unchanged when $P^*_{y^*_{gsi}}(\theta)$ is multiplied by a constant. To be more specific,
let $\Psi(\theta)$ be a function defined by

(5.6)  $P^*_{y^*_{gsi}}(\theta) = c_1 + (c_2 - c_1)\Psi(\theta)$ ,

where $0 \le c_1 < c_2 \le 1$. The first derivative of the natural logarithm of $P^*_{y^*_{gsi}}(\theta)$ is given by
$\frac{\partial}{\partial\theta} \log[c_1 + (c_2 - c_1)\Psi(\theta)]$, which equals $\frac{\partial}{\partial\theta} \log \Psi(\theta)$ if, and only if, $c_1 = 0$. The formula (5.6) has
been observed in a somewhat different context (cf. [1.1.1]) and these observations were summarized in
Chapter IV, where $P^*_{y^*_{gsi}}(\theta)$ is replaced by any strictly increasing function of $\theta$ with zero and unity as
its two asymptotes. We have called the four different types of functions derived from (5.6) Types A, B,
C and D (cf. Chapter IV), depending upon the values of $c_1$ and $c_2$, i.e., the function is of Type A
when $0 = c_1 < c_2 = 1$; of Type B when $0 < c_1 < c_2 = 1$; of Type C when $0 = c_1 < c_2 < 1$; and of
Type D when $0 < c_1 < c_2 < 1$, respectively. This implies that the cumulative operating characteristics
of Types A and C may share the same function for $A^*_{y^*_{gsi}}(\theta)$, and we can say the same for those of
Types B and D. The necessary and sufficient condition that $M_{y^*_{gsi}}(\theta)$ be strictly increasing in $\theta$ is
that the inequality

(5.7)  $A^*_{y^*_{gsi}}(\theta) > A^*_{y^*_{g(s-1)i'}}(\theta)$ ,  $s = 1, 2, \ldots, (m_{gi} + 1)$

holds almost everywhere with respect to $\theta$.

The operating characteristic, $P_{y^*_{gsi}}(\theta)$, defined for the union of the attainment scores $y^*_{gsi}$ is given
by

(5.8)  $P_{y^*_{gsi}}(\theta) = P^*_{y^*_{gsi}}(\theta) - \sum_{j} P^*_{y^*_{g(s+1)j}}(\theta)$ ,

where $\sum_j$ indicates the summation over all the strategies $j$ branching from the point which lies
immediately after the line representing $y^*_{gsi}$.

This operating characteristic can be considered as the likelihood function in estimating the subject’s
latent trait $\theta$.

When there is more than one problem to solve, i.e., $g = 1, 2, \ldots, n$, satisfying the conditional
independence of the attainment scores across the different items, given $\theta$, the maximum likelihood
estimation of the subject’s latent trait can be performed on the basis of the response pattern $V$, such
that

(5.9)  $V' = (y_{1i_1}, y_{2i_2}, \ldots, y_{gi_g}, \ldots, y_{ni_n})$

for the $n$ problem solving tasks, where $i_g$ is a strategy for solving the problem $g$ and $y_{gi_g}$ is the
attainment score when the subject chooses the strategy $i_g$ for solving the problem $g$. Let $P_V(\theta)$
be the operating characteristic of the specific response pattern $V$. We can write

(5.10)  $P_V(\theta) = \prod_{y^*_{gsi}} P_{y^*_{gsi}}(\theta)$ ,

where $\prod_{y^*_{gsi}}$ indicates the multiplication over every union $y^*_{gsi}$ to which an element of $V$ belongs.

It  is  beneficial  to  search  for  a  family  of  models  which  provide  us  with  a  unique  maximum  for  every 
possible  response  pattern  given  by  (5.9).  This  can  be  done  as  a  generalization  of  the  unique  maximum 
condition  proposed  for  the  graded  response  model  (cf.  Samejima,  1969,  1972). 

The basic function, $A_{y^*_{gsi}}(\theta)$, for the union of attainment scores $y^*_{gsi}$ is defined by

(5.11)  $A_{y^*_{gsi}}(\theta) = \frac{\partial}{\partial\theta} \log P_{y^*_{gsi}}(\theta)$ .

The maximum likelihood estimate, $\hat{\theta}_V$, of the subject’s latent trait based upon his response pattern is
given as the solution of the likelihood equation such that

(5.12)  $\frac{\partial}{\partial\theta} \log P_V(\theta) = \sum_{y^*_{gsi}} \frac{\partial}{\partial\theta} \log P_{y^*_{gsi}}(\theta) = \sum_{y^*_{gsi}} A_{y^*_{gsi}}(\theta) = 0$ ,

where $\sum_{y^*_{gsi}}$ indicates the summation over every union $y^*_{gsi}$ to which an element of $V$ belongs. A
sufficient condition that a unique modal point exists for the likelihood function $P_V(\theta)$ of each and
every response pattern $V$ is that this basic function is strictly decreasing in $\theta$ with non-negative and
non-positive values as its two asymptotes, respectively, for every union $y^*_{gsi}$. This can be shown in
the same way that we did for the basic function $A_{x_g}(\theta)$ of the graded item score $x_g$ (cf. Samejima,
1969). For brevity, sometimes we call this condition the unique maximum condition.
Similarities between the differential strategies in problem solving and the multi-correct responses
in testing are obvious. If we consider two or more different strategies which lead to the solution of
the problem as two or more different answers to a question, then they will be treated as multi-correct
responses. We can see that the concept of multi-correct responses can be transferred to differential
strategies, when there exists more than one successful strategy in solving a problem.

[V.4]  Homogeneous  Case 

The homogeneous case of the graded response level has been developed and discussed (Samejima,
1972) as a generalization of a family of models on the dichotomous response level. Sufficient conditions
that a model provides us with a unique modal point for the likelihood function of each and every
response pattern have been investigated. In the homogeneous case, a sufficient condition is that, for
an arbitrary item score $x_g$ $(\neq 0)$, the cumulative operating characteristic $P^*_{x_g}(\theta)$ is of Type A, i.e.,
strictly increasing in $\theta$ with zero and unity as its two asymptotes, and its asymptotic basic function,
$\tilde{A}_{x_g}(\theta)$, which is defined by

(5.13)  $\tilde{A}_{x_g}(\theta) = \frac{\partial}{\partial\theta}\left[\log\left\{\frac{\partial}{\partial\theta} P^*_{x_g}(\theta)\right\}\right]$ ,

is strictly decreasing in $\theta$. The satisfaction of this sufficient condition also implies two desirable features
of the model such that: 1) the operating characteristic of each graded item score of each item has a
single modal point, and 2) those modal points for a single item are arranged in the same order as the
item scores themselves. The normal ogive and logistic models, which have been generalized from the
corresponding models on the dichotomous response level, are two examples of the models which satisfy
the above sufficient condition.
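For these two models the asymptotic basic function of (5.13) can be written out explicitly. The short derivation below is a worked illustration rather than part of the original text; it writes a generic Type A function as $\Psi(\theta)$ with discrimination $a$ and difficulty $b$, symbols chosen here for brevity.

```latex
% Logistic model: \Psi(\theta) = [1 + \exp\{-Da(\theta - b)\}]^{-1}
\frac{\partial}{\partial\theta}\Psi(\theta) = Da\,\Psi(\theta)[1 - \Psi(\theta)] ,
\qquad
\tilde{A}(\theta)
  = \frac{\partial}{\partial\theta}\log\bigl\{Da\,\Psi(\theta)[1 - \Psi(\theta)]\bigr\}
  = Da\,[1 - 2\Psi(\theta)] .

% Normal ogive model: \Psi(\theta) = (2\pi)^{-1/2}\int_{-\infty}^{a(\theta - b)} \exp(-u^2/2)\,du
\frac{\partial}{\partial\theta}\Psi(\theta) = a\,(2\pi)^{-1/2}\exp\{-a^2(\theta - b)^2/2\} ,
\qquad
\tilde{A}(\theta) = -a^2(\theta - b) .
```

In the logistic case the function falls strictly from $Da$ to $-Da$ because $\Psi(\theta)$ is strictly increasing, and in the normal ogive case it is linear with negative slope, so both are strictly decreasing in $\theta$.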

These models of the homogeneous case on the graded response level can be generalized to provide
us with those which belong to the general model of differential strategies. Let $\Psi(\theta)$ be a function of
Type A. We shall consider the cumulative operating characteristic, $P^*_{y^*_{gsi}}(\theta)$, of the union of attainment
categories $y^*_{gsi}$ such that

(5.14)  $P^*_{y^*_{gsi}}(\theta) = \beta_{y^*_{gsi}}\,\Psi(\theta - \alpha_{y^*_{gsi}})$ ,

where $\alpha_{y^*_{g0i}}$ is negative infinity, $\alpha_{y^*_{g(m_{gi}+1)i}}$ is positive infinity for $i = 1, 2, \ldots, w$, and the values of
$\alpha_{y^*_{gsi}}$ are ordered in the same way as those of $s$ for every strategy, and $\beta_{y^*_{gsi}}$ is a constant which
equals unity for $s = 0$ and in general satisfies

(5.15)  $\beta_{y^*_{gsi}} = \sum_{j} \beta_{y^*_{g(s+1)j}}$ ,

with $\sum_j$ indicating the summation over all the strategies $j$ branching from the point of the differential
strategy tree which is located right after the line representing the union $y^*_{gsi}$. From (5.15) it is
obvious that, as far as there is no branching, $\beta_{y^*_{gsi}} = \beta_{y^*_{g(s-1)i}}$.

A sufficient condition that the model satisfies the unique maximum condition is: 1) that the values
of the constant $\alpha_{y^*_{g(s+1)j}}$ are the same for all the strategies $j$ which branch from the vertex located
immediately after the edge representing $y^*_{gsi}$, and 2) that we have


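almost everywhere in the domain of $\theta$. To prove this, we obtain from (5.8), (5.14), (5.15) and the
definition of the basic function $A_{y^*_{gsi}}(\theta)$, which was given by (5.11),

(5.17)  $A_{y^*_{gsi}}(\theta) = \frac{\partial}{\partial\theta} \log P_{y^*_{gsi}}(\theta) = \frac{\partial}{\partial\theta} \log\Bigl[\beta_{y^*_{gsi}}\Psi(\theta - \alpha_{y^*_{gsi}}) - \sum_{j} \beta_{y^*_{g(s+1)j}}\Psi(\theta - \alpha_{y^*_{g(s+1)j}})\Bigr]$ ,

where $\sum_j$ indicates the summation over all the strategies $j$ branching from the vertex which lies
immediately after the line representing the union $y^*_{gsi}$. By virtue of the first condition, we can rewrite
(5.17) in the form

(5.18)  $A_{y^*_{gsi}}(\theta) = \frac{\partial}{\partial\theta} \log\bigl[\beta_{y^*_{gsi}}\{\Psi(\theta - \alpha_{y^*_{gsi}}) - \Psi(\theta - \alpha_{y^*_{g(s+1)j}})\}\bigr] = \frac{\partial}{\partial\theta} \log\bigl[\Psi(\theta - \alpha_{y^*_{gsi}}) - \Psi(\theta - \alpha_{y^*_{g(s+1)j}})\bigr]$ .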

We notice that, if we replace $y^*_{gsi}$ by the graded item score $x_g$ and use $\Psi(\theta - \alpha_{x_g})$ as the cumulative
operating characteristic $P^*_{x_g}(\theta)$, the last form of (5.18) is identical with the basic function of the
graded item score, and the left hand side of (5.16) is identical with the corresponding asymptotic basic
function. Thus we can say that all the unions, $y^*_{gsi}$, are equivalent to syndrome response categories (cf.
Samejima, 1972, Section 5.2), and a unique maximum is assured for every possible response pattern.

If, for example, $\Psi(\theta)$ is a normal ogive function or a logistic distribution function, then (5.16) is
satisfied (Samejima, 1972, Section 5.2), and we can develop the normal ogive model and the logistic
model in the context of the general model for differential strategies, and both of them satisfy the unique
maximum condition. In these two models, the cumulative operating characteristics are defined by

(5.19)  $P^*_{y^*_{gsi}}(\theta) = \beta_{y^*_{gsi}}\,(2\pi)^{-1/2} \int_{-\infty}^{a_g(\theta - b_{y^*_{gsi}})} \exp(-u^2/2)\,du$

and

(5.20)  $P^*_{y^*_{gsi}}(\theta) = \beta_{y^*_{gsi}}\,[1 + \exp\{-Da_g(\theta - b_{y^*_{gsi}})\}]^{-1}$ ,

respectively, where $a_g$ $(> 0)$ is the discrimination parameter specific for each problem $g$, $b_{y^*_{gsi}}$ is a
difficulty parameter defined for each union of attainment scores, with $b_{y^*_{g0i}} = -\infty$ and $b_{y^*_{g(m_{gi}+1)i}} = \infty$,
and all those values are arranged in the same order as $s$ with respect to each strategy, and $D$ in
(5.20) is a scaling factor which assumes 1.7 to retain the same set of parameter values as those in the
normal ogive model.

Figure 5-8 presents the set of ten cumulative operating characteristics $P^*_{y^*_{gsi}}(\theta)$ in Example 1, with
the parameter values such that $a_g = 1.00$, $b_{y^*_{g11}} = b_{y^*_{g12}} = -2.50$, $b_{y^*_{g21}} = -1.[?]0$, $b_{y^*_{g31}} = 0.50$,
FIGURE 5-8

Cumulative Operating Characteristic of the Union of $(y_{g1} = 0)$ and $(y_{g2} = 0)$ (Solid Line), Those of
$(y_{g1} = 1)$, $(y_{g1} = 2)$ and $(y_{g1} = 3)$, Respectively (Dotted Lines), and Those of $(y_{g2} = 1)$,
$(y_{g2} = 2)$, $(y_{g2} = 3)$ and $(y_{g2} = 4)$, Respectively (Dashed Lines).

FIGURE 5-9

Operating Characteristic of the Union of $(y_{g1} = 0)$ and $(y_{g2} = 0)$ (Solid Line), Those of $(y_{g1} = 1)$,
$(y_{g1} = 2)$ and $(y_{g1} = 3)$, Respectively (Dotted Lines), and Those of $(y_{g2} = 1)$, $(y_{g2} = 2)$, $(y_{g2} = 3)$
and $(y_{g2} = 4)$, Respectively (Dashed Lines).
$b_{y^*_{g22}} = -1.30$, $b_{y^*_{g32}} = 0.00$, $b_{y^*_{g42}} = 2.00$, $\beta_{y^*_{g11}} = 0.60$ and $\beta_{y^*_{g12}} = 0.40$. The corresponding operating
characteristics are drawn in Figure 5-9, in which all the modal points except for the negative and positive
infinities are shown.
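The logistic model of (5.20) together with (5.8) can be sketched for Example 1 as follows. This is an illustrative sketch only: the function names are hypothetical, and the second difficulty value of the first strategy is not legible in the source, so the value $-1.50$ used below is a placeholder assumed for the purpose of the example.

```python
import math

D = 1.7
A_G = 1.00

# Difficulty parameters b for attainment scores s = 1, 2, ... of each strategy.
B = {
    1: [-2.50, -1.50, 0.50],        # -1.50 is an assumed placeholder value
    2: [-2.50, -1.30, 0.00, 2.00],
}
BETA = {1: 0.60, 2: 0.40}           # constants beta for the two branches


def cumulative(theta, i, s):
    """Cumulative operating characteristic of score s in strategy i, as in (5.20)."""
    if s == 0:
        return 1.0                  # b = -infinity and beta = 1
    if s == len(B[i]) + 1:
        return 0.0                  # hypothesized score m_gi + 1, b = +infinity
    x = D * A_G * (theta - B[i][s - 1])
    return BETA[i] / (1.0 + math.exp(-x))


def operating_characteristics(theta):
    """Operating characteristic of every union, following (5.8)."""
    probs = {(0, 1): 1.0 - sum(cumulative(theta, i, 1) for i in B)}
    for i in B:
        for s in range(1, len(B[i]) + 1):
            probs[(s, i)] = cumulative(theta, i, s) - cumulative(theta, i, s + 1)
    return probs


print(operating_characteristics(0.0))
```

At any fixed $\theta$ the eight operating characteristics are non-negative and add up to unity, which is a convenient check on the bookkeeping of the branching in (5.8).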

[V.5]  Single  Strategy  Case 

In the single strategy case, where there is only one successful strategy in our problem solving, things
are much simpler. There exists a parallelism with the graded response model (Samejima, 1972),
with the replacement of the item score $x_g$ by the attainment score $y_g$ of the unique successful strategy.

Let $A^*_{y_g}(\theta)$ be the first partial derivative of the natural logarithm of $P^*_{y_g}(\theta)$, where $P^*_{y_g}(\theta)$ equals
$P^*_{y^*_{gsi}}(\theta)$ with the replacement of $y^*_{gsi}$ by the single set of attainment score $y_g$, such that

(5.21)  $A^*_{y_g}(\theta) = \frac{\partial}{\partial\theta} \log P^*_{y_g}(\theta)$ ,  $y_g = 0, 1, \ldots, (m_g + 1)$ .

It has been shown that the necessary and sufficient condition that $M_{y_g}(\theta)$ be strictly increasing in $\theta$
is that the inequality

(5.22)  $A^*_{y_g}(\theta) > A^*_{(y_g - 1)}(\theta)$ ,  $y_g = 1, 2, \ldots, (m_g + 1)$

holds almost everywhere with respect to $\theta$ (cf. Samejima, 1967, 1972).

In the homogeneous case $P^*_{y_g}(\theta)$ has zero and unity as its two asymptotes for $y_g = 1, 2, \ldots, m_g$
and, furthermore, we can write

(5.23)  $P^*_{s}(\theta) = P^*_{r}(\theta - \alpha_{rs})$ ,

where $r$ and $s$ are two arbitrarily selected attainment categories with $r < s$, and $\alpha_{rs}$ is a positive
finite constant. We obtain from (5.21) and (5.23)

(5.24)  $A^*_{s}(\theta) = A^*_{r}(\theta - \alpha_{rs})$ .

From (5.22) and (5.24) it is obvious that a sufficient, though not necessary, condition that $M_{y_g}(\theta)$
be strictly increasing in $\theta$ for $y_g = 1, 2, \ldots, m_g$ is that $A^*_{y_g}(\theta)$ is strictly decreasing in $\theta$ for an
arbitrarily chosen attainment category out of 1 through $m_g$. When $m_g$ tends to positive infinity,
and $\alpha_{rs}$ for two adjacent attainment categories tends to zero, in the limiting situation this condition
becomes the necessary and sufficient condition, for it requires that

(5.25)  $A^*_{y_g}(\theta - \epsilon) > A^*_{y_g}(\theta)$

for any small positive value of $\epsilon$. Note that this condition is satisfied whenever the unique maximum
condition is satisfied. Above all, when the asymptotic basic function, $\tilde{A}_{y_g}(\theta)$, which is defined by

(5.26)  $\tilde{A}_{y_g}(\theta) = \frac{\partial}{\partial\theta} \log \left\{ \frac{\partial}{\partial\theta} P^*_{y_g}(\theta) \right\}$ ,

is strictly decreasing in $\theta$, not only is $M_{y_g}(\theta)$ of each subprocess strictly increasing in $\theta$, however
finely differentiated it may be, but also a unique maximum is assured for the likelihood function of
each and every possible response pattern which consists of such attainment scores of different tasks (cf.
Samejima, 1972, Sections 5.1 and 5.2). It has been shown (Samejima, 1967, 1972), for example, that in
the normal ogive model and in the logistic model on the graded response level (Samejima, 1969) this
condition is satisfied. In the former example, we can write

(5.27)  $P^*_{y_g}(\theta) = (2\pi)^{-1/2} \int_{-\infty}^{a_g(\theta - b_{y_g})} \exp(-u^2/2) \, du$

and in the latter

(5.28)  $P^*_{y_g}(\theta) = [1 + \exp\{-D a_g (\theta - b_{y_g})\}]^{-1}$ ,

where $a_g$ is the item discrimination parameter and $b_{y_g}$ is the item difficulty parameter for the
attainment category $y_g$, and $D$ is a scaling factor which is usually set equal to 1.7 (Birnbaum,
1968). In both models, the upper asymptote of $M_{y_g}(\theta)$ for $y_g = 1, 2, \ldots, m_g$ is unity, while the lower
asymptote is zero in the normal ogive model and $\exp[-D a_g (b_{y_g} - b_{(y_g - 1)})]$ in the logistic model. This
lower asymptote in the logistic model depends upon the distance between the difficulty parameters of
the two adjacent attainment categories, assuming zero for $y_g = 1$ and positive numbers less than unity
otherwise. In both models, $M_{y_g}(\theta)$ for $y_g = 2, 3, \ldots, m_g$ tends to unity for the entire range of $\theta$ as
$b_{y_g}$ approaches $b_{(y_g - 1)}$, and tends to $P^*_{y_g}(\theta)$ as $b_{y_g}$ departs from $b_{(y_g - 1)}$ (cf. Samejima, 1972,
Figure 5-2-1).
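
The asymptotes described above can be checked numerically. The following is a minimal sketch, not part of the original report: it assumes that $M_{y_g}(\theta)$ is the ratio $P^*_{y_g}(\theta) / P^*_{(y_g - 1)}(\theta)$, as in the graded response model, and the parameter values and function names are hypothetical.

```python
import math

D = 1.7  # scaling factor of (5.28)

def p_star_normal(theta, a_g, b):
    """Normal ogive P*(theta) of (5.27), via the complementary error function."""
    return 0.5 * math.erfc(-a_g * (theta - b) / math.sqrt(2.0))

def p_star_logistic(theta, a_g, b):
    """Logistic P*(theta) of (5.28)."""
    return 1.0 / (1.0 + math.exp(-D * a_g * (theta - b)))

def processing(p_star, theta, a_g, b_prev, b_curr):
    """Assumed form M(theta) = P*_{y}(theta) / P*_{y-1}(theta)."""
    return p_star(theta, a_g, b_curr) / p_star(theta, a_g, b_prev)

# Hypothetical parameter values, chosen only for illustration.
a_g, b_prev, b_curr = 1.0, -0.5, 0.5

# Far to the left, the logistic ratio settles at exp[-D a_g (b_curr - b_prev)],
# while the normal ogive ratio keeps shrinking toward zero.
low_logistic = processing(p_star_logistic, -20.0, a_g, b_prev, b_curr)
predicted_low = math.exp(-D * a_g * (b_curr - b_prev))
low_normal = processing(p_star_normal, -6.0, a_g, b_prev, b_curr)

# Far to the right, both ratios approach unity.
high_logistic = processing(p_star_logistic, 20.0, a_g, b_prev, b_curr)
high_normal = processing(p_star_normal, 6.0, a_g, b_prev, b_curr)
```

Moving `b_curr` toward `b_prev` pushes `predicted_low` toward unity, which is the behavior the paragraph describes for adjacent categories whose difficulty parameters approach each other.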

This simplified model will be useful not only for cognitive processes, but also for paper-and-pencil
testing, provided that the test is constructed in such a way that each item includes only one successful
path to reach the correct answer. Such an example is given in the research report [1.1.3].


[V.6]  Information  Provided  by  Differential  Strategies 

The information function, $I_{y^*_{js}}(\theta)$, for the union of the attainment scores $y^*_{js}$ is defined by

(5.29)  $I_{y^*_{js}}(\theta) = -\frac{\partial^2}{\partial\theta^2} \log P_{y^*_{js}}(\theta) = -\frac{\partial}{\partial\theta} A_{y^*_{js}}(\theta)$ ,

where $P_{y^*_{js}}(\theta)$ is the operating characteristic, and $A_{y^*_{js}}(\theta)$ is the basic function, of $y^*_{js}$, respectively.
This function is non-negative whenever the unique maximum condition is satisfied. In the homogeneous
case, if there is a single value $\alpha_{y^*_s}$ common for all the strategies $j$, which leads to the satisfaction
of the unique maximum condition, then we can write

(5.30)  $I_{y^*_{js}}(\theta) = -\frac{\partial^2}{\partial\theta^2} \log \{ \Psi(\theta - \alpha_{y^*_s}) - \Psi(\theta - \alpha_{y^*_{(s+1)}}) \}$ .

For the item information function $I_g(\theta)$ we can write

(5.31)  $I_g(\theta) = E[I_{y^*_{js}}(\theta) \mid \theta] = \sum_{y^*_{js}} I_{y^*_{js}}(\theta) P_{y^*_{js}}(\theta)$ ,

where $\sum_{y^*_{js}}$ indicates the summation over all the unions of attainment scores, or over all the edges
in the differential strategy tree. It is obvious that, in general, the more subprocesses we have within
each strategy the greater amount of item information we get, with the continuous subprocesses as the
limiting case. The differentiation of strategies itself does not necessarily increase the amount of item
information, however.

When we have $n$ problem solving tasks which require the same latent trait $\theta$, for the test
information function $I(\theta)$ we can write

(5.32)  $I(\theta) = \sum_{g=1}^{n} I_g(\theta)$ .


[V.7]  Discussion 

A  question  may  arise  as  to  which  estimate  of  the  latent  trait  should  be  taken  if  the  subject  faltered 
from  one  strategy  to  another  and  did  not  reach  the  solution  of  the  problem.  One  answer  to  this  question 
may  be  to  take  the  attainment  score  of  the  strategy  that  he  took  last,  and  use  its  corresponding  operating 
characteristic  in  estimating  his  latent  trait.  Another  answer  may  be  to  compare  the  resultant  estimates 
of $\theta$ obtained by the separate strategies the subject has taken and select the highest estimate.

The usefulness of the proposed model is yet to be discovered. We need the collaboration of cognitive
psychologists who are willing to collect data on larger samples, taking advantage of modern technologies.

VI  Latent Trait Models for Partially Continuous and
Partially Discrete Responses


A set of latent trait models was proposed during this research period, which deals with the mixture
of continuous and discrete responses. This family of models is an expansion and a generalisation of
the  one  proposed  by  the  principal  investigator  in  1973  (Samejima,  1973),  in  which  the  open  response 
situation  is  dealt  with.  The  family  is  represented  by  the  closed  response  situation,  and  it  also  includes 
the  model  for  the  open  response  situation  as  a  special  case,  as  well  as  those  models  for  the  open/closed 
and  the  closed/open  response  situations. 


In  this  chapter,  the  outline  of  these  new  models  will  be  described,  and  one  separate  ongoing  research 
project  on  the  Rorschach  diagnosis  for  which  these  models  are  to  be  used  will  be  introduced  as  an 
example.  For  the  details  and  further  information  about  the  models,  see  [1.1.5]. 


[VI.1]  Rationale


Let $\theta$ be the unidimensional latent trait, or any hypothetical construct, which assumes all real
numbers. Let $g$ $(= 1, 2, \ldots, n)$ be an item, which is the smallest, concrete entity devised for
measuring the latent trait. The assumption that our latent space is unidimensional implies that the
conditional or local independence of the distributions of the item responses of separate items, given $\theta$,
holds in the unidimensional latent space.


The distinction between the open response situation and the closed response situation may be well
illustrated, schematically, by Figure 6-1. Suppose that the subject is asked to check a point on a line
segment illustrated in Figure 6-1 in accordance with his judgment required for the task in item $g$.
Without loss of generality, we can assign the item score $z_g$, which varies from zero through unity, to each
point on the line segment.


It will be reasonable to assume that the probability assigned to any particular point on the line
segment is nil, provided that the subject is not allowed to check either of the two endpoints. We call it
the open response situation, and assume a continuous distribution for the item score $z_g$. If the subject
is allowed to check either of the two endpoints as well as the others, however, the probability assigned
to these points may not be nil. We call this second situation the closed response situation, and the
distribution of $z_g$ must be discrete at the two endpoints, i.e., at $z_g = 0$ and $z_g = 1$, and continuous
otherwise. In a similar manner we can define the open/closed response situation and the closed/open
response situation.


[VI.2]  Conditional Distribution of the Item Score


FIGURE 6-1

An Example of the Response Formats Which Allow Continuous Responses.

FIGURE 6-2

Five Hypothetical Functional Relationships (Solid Lines) between the Continuous Item Score $z_g$ and
the Difficulty Parameter $b_{z_g}$, Which Are Given by $b_{z_g} = b_0 + (b_1 - b_0) z_g^k$, for $k = 1, 3, 5, 7, 9$, with
the Parameters, $b_0 = -2.0$ and $b_1 = 2.0$, and the Corresponding Relationships (Dotted Lines) Given
by $b_{z_g} = b_1 - (b_1 - b_0)(1 - z_g)^k$. Closed Response Situation.

Let $P^*_{z_g}(\theta)$ be the conditional probability with which the subject obtains the item score $z_g$ or
greater, given $\theta$. A general mathematical form for $P^*_{z_g}(\theta)$ in the homogeneous case of the continuous
response level is given by

(6.1)  $P^*_{z_g}(\theta) = \int_{-\infty}^{a_g(\theta - b_{z_g})} \Psi_g(t) \, dt$

with


(6.2)  $\lim_{\theta \to -\infty} P^*_{z_g}(\theta) = 0$ ,  $\lim_{\theta \to \infty} P^*_{z_g}(\theta) = 1$ ,

where $a_g (> 0)$ is the item discrimination parameter, $b_{z_g}$ is the item response difficulty parameter, and
$\Psi_g(\cdot)$ is a specific continuous function which characterizes the model, and is positive almost everywhere.
Two examples of $\Psi_g(t)$ are the normal ogive function and the logistic function, which are specified,
respectively, by


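(6.3)  $\Psi_g(t) = (2\pi)^{-1/2} \exp(-t^2/2)$ ,

(6.4)  $\Psi_g(t) = D \exp(-Dt) [1 + \exp(-Dt)]^{-2}$ ,

where $D$ is a scaling factor. The operating density characteristic, $H_{z_g}(\theta)$, has been defined, and it
can be written in the form

(6.5)  $H_{z_g}(\theta) = a_g \Psi_g\{a_g(\theta - b_{z_g})\} \frac{\partial}{\partial z_g} b_{z_g}$ ,  $0 < z_g < 1$ .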


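Let $P_{z_g}(\theta)$ be the conditional probability of $z_g$, given $\theta$. In the closed response situation, we can
write

(6.6)  $\int_0^1 H_{z_g}(\theta) \, dz_g = 1 - [P_0(\theta) + P_1(\theta)] \le 1$ ,

where $P_0(\theta)$ and $P_1(\theta)$ indicate $P_{z_g}(\theta)$ for $z_g = 0$ and $z_g = 1$, respectively. We can also write
for the difficulty parameter $b_{z_g}$

(6.7)  $\lim_{z_g \to 0} b_{z_g} = b_0 \ge -\infty$ ,  $\lim_{z_g \to 1} b_{z_g} = b_1 \le \infty$ .

We obtain from the definitions of $P^*_{z_g}(\theta)$ and $P_{z_g}(\theta)$

(6.8)  $P_{z_g}(\theta) = \begin{cases} Q^*_0(\theta) & z_g = 0 \\ P^*_1(\theta) & z_g = 1 , \end{cases}$

where

(6.9)  $Q^*_{z_g}(\theta) = 1 - P^*_{z_g}(\theta)$ .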


It is noted that in the closed response situation inequality holds in (6.6) and in each formula of (6.7),
and that we can create each of the other three response situations by setting $P_0(\theta) = 0$ or $P_1(\theta) = 0$,
or both.
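
As an illustrative check of (6.6), the sketch below integrates the operating density characteristic of (6.5) over $z_g$ and adds the two endpoint probabilities of (6.8). It assumes the normal ogive function (6.3), a linear relationship between $z_g$ and $b_{z_g}$, and the parameter values $a_g = 1.0$, $b_0 = -2.0$, $b_1 = 2.0$; the function names and the choice $\theta = 0.5$ are hypothetical.

```python
import math

a_g, b0, b1 = 1.0, -2.0, 2.0  # illustrative parameter values

def psi_normal(t):
    """Normal ogive density function of (6.3)."""
    return math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)

def cdf_normal(t):
    return 0.5 * math.erfc(-t / math.sqrt(2.0))

def b_of_z(z):
    """Assumed linear difficulty parameter function."""
    return b0 + (b1 - b0) * z

def density_H(theta, z):
    """(6.5): a_g * Psi{a_g(theta - b_z)} * (d b_z / d z)."""
    return a_g * psi_normal(a_g * (theta - b_of_z(z))) * (b1 - b0)

def endpoint_probs(theta):
    """(6.8): P_0 = Q*_0 and P_1 = P*_1."""
    p_star_0 = cdf_normal(a_g * (theta - b0))
    p_star_1 = cdf_normal(a_g * (theta - b1))
    return 1.0 - p_star_0, p_star_1

def simpson(f, lo, hi, n=2000):
    """Composite Simpson rule; n must be even."""
    h = (hi - lo) / n
    s = f(lo) + f(hi)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * f(lo + i * h)
    return s * h / 3.0

theta = 0.5
continuous_mass = simpson(lambda z: density_H(theta, z), 0.0, 1.0)
p0, p1 = endpoint_probs(theta)
total = p0 + continuous_mass + p1  # should be one, up to quadrature error
```

The continuous part carries less than the full probability mass because both endpoints have non-zero probability, which is the strict inequality in (6.6) for the closed response situation.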

It is obvious from (6.5) that the operating density characteristic, $H_{z_g}(\theta)$, depends heavily upon the
relationship between the item score $z_g$ and the difficulty parameter $b_{z_g}$, as well as on the functional
formula $\Psi_g(\cdot)$. In the closed response situation, the relationship between $z_g$ and $b_{z_g}$ can be any
strictly increasing function including the linear function, with the constraint that the values of $b_{z_g}$ are
a priori specified at $z_g = 0$ and $z_g = 1$.

For practical purposes, it may be appropriate to consider various strictly increasing polynomials for
approximations to such functional relationships. In such approximations, the method of moments for
fitting polynomials will be a useful tool (cf. Samejima and Livingston, 1979). We can write for a set of
convex polynomials

(6.10)  $b_{z_g} = b_0 + \sum_{j=1}^{k} \alpha_j z_g^j$

with the two constraints,

(6.11)  $\sum_{j=1}^{k} \alpha_j = b_1 - b_0$

and

(6.12)  $\frac{\partial}{\partial z_g} b_{z_g} = \sum_{j=1}^{k} j \alpha_j z_g^{j-1} \ge 0$ ,  $0 < z_g < 1$ ,

where strict inequality holds for all values of $z_g$ between zero and unity, except, at most, at an
enumerable number of points. A sufficient, though not necessary, condition for the second constraint
to hold is that $\alpha_j \ge 0$ for $j = 1, 2, \ldots, k$. We notice that the linear relationship holds by setting
$k = 1$. A set of concave polynomials can be obtained under the same condition by

(6.13)  $b_{z_g} = b_1 - \sum_{j=1}^{k} \alpha_j (1 - z_g)^j$

with the same constraints given by (6.11) and (6.12). Five examples of each of the two sets of polynomials
with $k = 1, 3, 5, 7, 9$, and with $\alpha_j = 0$ for $j < k$, $b_0 = -2.0$ and $b_1 = 2.0$ are drawn by solid and
dotted lines in Figure 6-2, respectively.
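
The two families of polynomials drawn in Figure 6-2 can be tabulated with a few lines of code. This is a minimal sketch, assuming the single-coefficient forms given in the caption of Figure 6-2 ($\alpha_k = b_1 - b_0$ and $\alpha_j = 0$ for $j < k$); the function names and the coarse grid are hypothetical choices.

```python
b0, b1 = -2.0, 2.0  # endpoint difficulties used for Figure 6-2

def convex_b(z, k):
    """Convex family with alpha_k = b1 - b0 and alpha_j = 0 for j < k."""
    return b0 + (b1 - b0) * z ** k

def concave_b(z, k):
    """Concave family with the same single nonzero coefficient."""
    return b1 - (b1 - b0) * (1.0 - z) ** k

def strictly_increasing(values):
    return all(u < v for u, v in zip(values, values[1:]))

grid = [i / 10.0 for i in range(11)]  # z_g = 0.0, 0.1, ..., 1.0
checks = {}
for k in (1, 3, 5, 7, 9):
    for name, f in (("convex", convex_b), ("concave", concave_b)):
        values = [f(z, k) for z in grid]
        # Record both endpoint values and whether the curve is increasing.
        checks[(name, k)] = (values[0], values[-1], strictly_increasing(values))
```

Every entry of `checks` reports the endpoints $b_0$ and $b_1$ and a strictly increasing curve on this grid, in line with the two constraints; for $k = 1$ the two families coincide with the linear relationship.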


In contrast to the observations made so far in the closed response situation, neither in the closed/open
response situation nor in the open/closed response situation can the functional relationship between the
item score $z_g$ and the difficulty parameter $b_{z_g}$ be linear, nor can it be approximated by a polynomial.
One suitable formula in the closed/open response situation may be

(6.14)  $b_{z_g} = b_0 + \tan[(\pi/2) \xi(z_g)]$ ,

where $\xi(z_g)$ is any strictly increasing, continuous function of $z_g$ defined for $0 \le z_g < 1$, with the
constraint

(6.15)  $\xi(z_g) = \begin{cases} 0 & z_g = 0 \\ 1 & z_g = 1 . \end{cases}$

Two examples of $\xi(z_g)$ are given by polynomials such that

(6.16)  $\xi(z_g) = \sum_{j=1}^{k} \alpha_j z_g^j$

(6.17)  $\xi(z_g) = 1 - \sum_{j=1}^{k} \alpha_j (1 - z_g)^j$ ,

with the constraints given by the right hand inequality of (6.12) and

(6.18)  $\sum_{j=1}^{k} \alpha_j = 1$ .

Figure 6-3 presents by solid curves five examples of the above functional relationship with (6.16) as
$\xi(z_g)$ and with $b_0 = -2.0$, where $k = 1, 3, 5, 7, 9$ and $\alpha_j = 0$ for $j < k$. In the same figure, also
presented by dotted curves are the corresponding five examples of (6.14), in which $\xi(z_g)$ is specified
by (6.17) with the same set of parameter values.


Figure 6-4 illustrates by solid curves the operating density characteristic $H_{z_g}(\theta)$ in the normal
ogive model as a function of the continuous item score $z_g$, for each of the four fixed values of $\theta$, i.e.,
-3.0, -2.5, -2.0 and 0.0, with the parameters, $a_g = 1.0$ and $b_0 = -2.0$, using (6.14) as the functional
relationship between $z_g$ and $b_{z_g}$ with the specification of $\xi(z_g)$ by (6.16) in which $k = 1$ and
$\alpha_1 = 1.0$. In the same figure, also presented by dotted curves are the corresponding operating density
characteristics in the logistic model, in which $D = 1.7$.
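
The curves of Figure 6-4 can be reproduced in outline as follows. The sketch assumes the normal ogive function (6.3) and the tangent relationship (6.14) with $\xi(z_g) = z_g$, and it checks that the area under each curve over $z_g$ equals $P^*_0(\theta) = 1 - P_0(\theta)$, which is what (6.6) gives when $P_1(\theta) = 0$. The helper names and the quadrature settings are hypothetical.

```python
import math

a_g, b0 = 1.0, -2.0  # parameter values quoted for Figure 6-4

def psi_normal(t):
    return math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)

def b_of_z(z):
    """(6.14) with xi(z) = z, i.e. k = 1 and alpha_1 = 1 in (6.16)."""
    return b0 + math.tan(0.5 * math.pi * z)

def db_dz(z):
    """Derivative of b_z with respect to z: (pi/2) sec^2[(pi/2) z]."""
    return 0.5 * math.pi / math.cos(0.5 * math.pi * z) ** 2

def density_H(theta, z):
    """(6.5) for the closed/open response situation."""
    return a_g * psi_normal(a_g * (theta - b_of_z(z))) * db_dz(z)

def simpson(f, lo, hi, n=4000):
    """Composite Simpson rule; n must be even."""
    h = (hi - lo) / n
    s = f(lo) + f(hi)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * f(lo + i * h)
    return s * h / 3.0

def continuous_mass(theta):
    # Stop just short of z = 1, where tan() diverges; the density there is negligible.
    return simpson(lambda z: density_H(theta, z), 0.0, 1.0 - 1e-9)

def p_star_0(theta):
    return 0.5 * math.erfc(-a_g * (theta - b0) / math.sqrt(2.0))

masses = {theta: continuous_mass(theta) for theta in (-3.0, -2.5, -2.0, 0.0)}
```

For $\theta = -2.0$, for instance, half of the probability mass lies on the continuous part and the other half on the closed endpoint $z_g = 0$.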


FIGURE 6-3

Five Hypothetical Functional Relationships (Solid Lines) between the Continuous Item Score $z_g$ and
the Difficulty Parameter $b_{z_g}$, Which Are Given by $b_{z_g} = b_0 + \tan[(\pi/2) z_g^k]$ for $k = 1, 3, 5, 7, 9$, with
the Parameter, $b_0 = -2.0$, and the Corresponding Relationships (Dotted Lines) Given by
$b_{z_g} = b_0 + \tan[(\pi/2)\{1 - (1 - z_g)^k\}]$. Closed/Open Response Situation.

FIGURE 6-4

Operating Density Characteristic, $H_{z_g}(\theta)$, for Each of the Four Values of $\theta$, i.e.,
-3.0, -2.5, -2.0 and 0.0, Following the Normal Ogive Model with $a_g = 1.0$ and
$b_0 = -2.0$, with $b_{z_g} = b_0 + \tan[(\pi/2) z_g]$, Represented by a Solid Curve.
Corresponding Four Functions Following the Logistic Model with $D = 1.7$
Are Also Drawn by Dotted Curves. Closed/Open Response Situation.

Similarly, in the open/closed response situation, one useful formula for the relationship between the
continuous item score $z_g$ and the difficulty parameter may be

(6.19)  $b_{z_g} = b_1 + \tan[(-\pi/2) \zeta(z_g)]$ ,

where $\zeta(z_g)$ is any strictly decreasing, continuous function of $z_g$ defined for $0 < z_g \le 1$, with the
constraint

(6.20)  $\zeta(z_g) = \begin{cases} 1 & z_g = 0 \\ 0 & z_g = 1 . \end{cases}$

Again, for practical purposes, it may suffice if we consider polynomials such that

(6.21)  $\zeta(z_g) = \sum_{j=1}^{k} \alpha_j (1 - z_g)^j$

or

(6.22)  $\zeta(z_g) = 1 - \sum_{j=1}^{k} \alpha_j z_g^j$ ,

where $k$ is the degree of polynomial and $\alpha_j$ $(j = 1, 2, \ldots, k)$ is a coefficient, with the constraints
given by the right hand inequality of (6.12) and (6.18). If we set $b_1 = 2.0$ and adopt the parameter
values that we used in the examples illustrated in Figure 6-3 for the closed/open response situation,
the functional relationships given by (6.19) with (6.21) and (6.22) for $\zeta(z_g)$ provide us with the set
of curves obtained by rotating those of Figure 6-3 by one hundred and eighty degrees, keeping the
unit of the ordinate and changing the upper asymptote of the curves to +2.0. Also, by rotating the
curves in Figure 6-4 in the three dimensional space with $z_g = 0.5$ as the axis of rotation we obtain the
corresponding examples of the operating density characteristics of the open/closed response situation
for $\theta$ = 0.0, 2.0, 2.5 and 3.5, respectively.

[VI.3]  Parametric and Nonparametric Estimations of Operating Density Characteristics

In  the  parametric  estimation  of  the  operating  density  characteristic,  some  appropriate  model  must 
be  selected  first,  and  then  the  estimation  is  reduced  to  that  of  the  item  parameters  that  belong  to  the 
specific  model.  Thus,  in  the  parameter  estimation,  model  validation  at  the  end  of  each  stage  of  research 
is  a  necessary  and  important  procedure.  We  notice  that,  in  the  parametric  approach,  we  can  always 
reduce  the  data  based  upon  the  continuous  response  level  to  those  based  either  upon  the  graded  response 
level  or  upon  the  dichotomous  response  level,  by  categorizing  the  continuous  responses  into  appropriate 
discrete  response  categories.  Thus  those  methods  developed  for  the  item  parameter  estimation  on  the 
dichotomous  response  level  (e.g.,  Lord,  1952,  Bock  and  Aitkin,  1981)  and  their  variations  developed  for 
the  graded  response  level,  are  directly  applicable  in  estimating  the  item  parameters  of  the  operating 
density characteristics. To be more specific, by adopting an appropriate set of values of $z_g$, we shall
be able to obtain the corresponding set of estimated values of $b_{z_g}$, and then by an appropriate curve
fitting we shall be able to obtain the estimated difficulty parameter function. Since our data on the
continuous  response  level  contain  more  information,  in  so  doing  we  can  also  conduct  a  model  validation 
study,  if  we  design  our  research  appropriately. 

In  the  non-parametric  estimation  of  the  operating  density  characteristics,  we  assume  no  mathematical 
forms  a  priori.  In  this  direct  approach,  again  we  can  reduce  our  data  to  those  which  are  based  upon  the 
graded  response  level,  and  those  non-parametric  methods  developed  for  discrete  responses  (e.g.,  Levine, 
1980, Samejima, 1977, 1981, and cf. Chapter II) can be applied. If we have an Old Test, or a set of items
whose operating characteristics or operating density characteristics are already known, the application
of the techniques will be straightforward. When there is no Old Test, we can select a certain subset of
items having high content validity out of all the items in our research, and use this subset in place of the
Old Test. In so doing, we may assume several different models for our "Old Test" items, estimate the
item parameters using suitable parametric methods, validate or invalidate each model, and select the
model which has the highest validity. We may end up selecting different models for different items.
In  such  a  case,  as  far  as  each  model  satisfies  the  unique  maximum  condition  (Samejima,  1969,  1972, 
1973),  we  can  still  obtain  the  maximum  likelihood  estimate  of  the  subject’s  latent  trait,  or  individual 
parameter,  by  using  the  basic  functions  (Samejima,  1969,  1973)  based  upon  those  separate  models. 

In the half-open and half-closed response situations, or in the closed response situation, there is
another method, which is a combination of the parametric approach and the non-parametric approach.
In the first pair of situations, we can reduce our data to those on the dichotomous response level by using
the endpoint with a non-zero probability as one category, and recategorizing all the other continuous
responses as the other discrete category. We can use all the items thus dichotomized as the Old Test,
searching for a suitable model, or models, in the same way described in the preceding paragraph. In the
closed  response  situation,  we  can  trichotomize  all  the  responses,  using  both  endpoints  as  the  lowest  and 
highest  categories  and  all  the  continuous  responses  as  the  intermediate  category,  and  follow  the  same 
procedure.  We  can  also  conceive  of  many  other  variations,  depending  upon  the  points  where  responses 
are  discrete. 
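
The recategorization described above amounts to a simple mapping of item scores. The following sketch is only an illustration of that data reduction; the function names and the sample responses are hypothetical, and the coding of categories as 0, 1, 2 is an assumption made here for convenience.

```python
def trichotomize(z):
    """Closed response situation: both endpoints are discrete categories."""
    if z == 0.0:
        return 0  # lowest category: the endpoint z_g = 0
    if z == 1.0:
        return 2  # highest category: the endpoint z_g = 1
    return 1      # intermediate category: every continuous response

def dichotomize(z, discrete_endpoint):
    """Half-open situations: the endpoint with non-zero probability is one category."""
    at_endpoint = (z == discrete_endpoint)
    if discrete_endpoint == 0.0:
        # closed/open: the endpoint is the lower category
        return 0 if at_endpoint else 1
    # open/closed: the endpoint is the higher category
    return 1 if at_endpoint else 0

responses = [0.0, 0.25, 1.0, 0.8, 0.0]  # hypothetical item scores
graded = [trichotomize(z) for z in responses]
```

After this reduction the items can be treated as graded or dichotomous items, so that the existing parametric methods apply to every item in the test.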

The  main  difference  between  this  new  method  and  the  preceding  one  is  that  in  the  new  method  we 
make  use  of  all  the  items  used  in  our  research  as  the  Old  Test,  while  in  the  other  only  a  subset  of  items 
is  used.  In  general,  the  choice  of  a  method  should  depend  upon  the  nature  of  our  data,  including  the 
configuration  of  the  characteristics  of  our  items,  the  sample  size  of  subjects,  and  so  forth. 

[VI.4]  Estimation  of  the  Individual  Parameters  of  Subjects 

When the item parameters are known, or well estimated, the estimation of the individual parameter,
or the point of the latent trait $\theta$ at which the subject is located, can be performed through the
maximum likelihood estimation. If a simple sufficient statistic for the response pattern $V$ such that

(6.23)  $V = (z_1, z_2, \ldots, z_g, \ldots, z_n)$

does not exist, as is the case with most models, we will use the basic function (Samejima, 1969, 1972,
1973) and follow the numerical process to obtain the maximum likelihood estimate $\hat{\theta}_V$ for each response
pattern $V$.

We can write for the general form of the basic function in the closed response situation

(6.24)  $A_{z_g}(\theta) = \begin{cases} -a_g \Psi_g\{a_g(\theta - b_0)\} [Q^*_0(\theta)]^{-1} & z_g = 0 \\ [\frac{\partial}{\partial\theta} \Psi_g\{a_g(\theta - b_{z_g})\}] [\Psi_g\{a_g(\theta - b_{z_g})\}]^{-1} & 0 < z_g < 1 \\ a_g \Psi_g\{a_g(\theta - b_1)\} [P^*_1(\theta)]^{-1} & z_g = 1 . \end{cases}$

It has been shown (Samejima, 1972) in a somewhat different context that, if $\Psi_g(\cdot)$ follows one of
the formulae such as (6.3) and (6.4), those three functions in (6.24) are strictly decreasing in $\theta$, a
fact that leads to the unique maximum for the likelihood function $L_V(\theta)$ for each and every response
pattern $V$.

(6.24)  also  includes  the  basic  functions  of  all  the  other  three  response  situations,  i.e.,  they  are  realized 
by  excluding  the  line  in  (6.24)  corresponding  to  each  open  endpoint.  The  same  rule  applies  for  the  item 
response  information  function  which  will  be  discussed  later  in  this  chapter. 

In the normal ogive model, which is characterized by (6.3), the basic function takes the form

(6.25)  $A_{z_g}(\theta) = \begin{cases} -(2\pi)^{-1/2} a_g \exp[-a_g^2(\theta - b_0)^2/2] [Q^*_0(\theta)]^{-1} & z_g = 0 \\ -a_g^2 (\theta - b_{z_g}) & 0 < z_g < 1 \\ (2\pi)^{-1/2} a_g \exp[-a_g^2(\theta - b_1)^2/2] [P^*_1(\theta)]^{-1} & z_g = 1 . \end{cases}$

This function is strictly decreasing in $\theta$ for all the values of $z_g$, and, in particular, for $0 < z_g < 1$ it
is a linear function with the slope $-a_g^2$ which intercepts the abscissa at $\theta = b_{z_g}$. The two asymptotes
of this basic function are zero and negative infinity for $z_g = 0$, positive and negative infinities for
$0 < z_g < 1$, and positive infinity and zero for $z_g = 1$, respectively. For the basic function in the logistic
model, which is specified by (6.4), we have


(6.26)  $A_{z_g}(\theta) = \begin{cases} -D a_g P^*_0(\theta) & z_g = 0 \\ D a_g [1 - 2 P^*_{z_g}(\theta)] & 0 < z_g < 1 \\ D a_g Q^*_1(\theta) & z_g = 1 . \end{cases}$

We can see that this is also strictly decreasing in $\theta$ throughout the entire range of $\theta$ for each and
every item score $z_g$. It is not a linear function for $0 < z_g < 1$, however, although it also intercepts
the abscissa at $\theta = b_{z_g}$. The two asymptotes of the basic function are zero and $-D a_g$ for $z_g = 0$,
$D a_g$ and $-D a_g$ for $0 < z_g < 1$, and $D a_g$ and zero for $z_g = 1$, respectively.


The upper graph of Figure 6-5 illustrates five examples of the operating density characteristic,
$H_{z_g}(\theta)$, of the continuous item response $z_g$ in the normal ogive model with the item parameters
$a_g = 1.0$, $b_0 = -2.0$ and $b_1 = 2.0$, for $z_g$ = 0.1, 0.3, 0.5, 0.7, 0.9 in the closed response situation,
where the difficulty parameter $b_{z_g}$ is given as the linear function of $z_g$. In the same graph, also
presented by dashed lines are those in the two limiting cases where $z_g$ tends to zero and unity,
respectively. The corresponding five operating characteristics and those in the two limiting cases in the
logistic model with the same set of item parameters and the scaling factor, $D = 1.7$, are shown in the
lower graph of Figure 6-5.


FIGURE 6-5

Operating Density Characteristic, $H_{z_g}(\theta)$, as a Function of $\theta$ for Each of the Five Values of the
Item Score, 0.1, 0.3, 0.5, 0.7 and 0.9, Following the Normal Ogive and the Logistic Models, with
$a_g = 1.0$, $b_0 = -2.0$, $b_1 = 2.0$ and $D = 1.7$, When the Linear Relationship Holds between the
Item Score $z_g$ and the Difficulty Parameter $b_{z_g}$. The Additional Two Curves Are Those in the
Limiting Situations Where $z_g$ Tends to Zero and Unity, Respectively. Closed Response Situation.

It should be recalled (Samejima, 1973, 1974) that a sufficient statistic, $t(V) = \sum_{z_g \in V} a_g^2 b_{z_g}$, exists in
the normal ogive model in the open response situation. It is not the case with the other three response
situations, however, which include $z_g = 0$ or $z_g = 1$, or both, although we can still use $t(V)$ defined
above for any response pattern which does not include either zero or unity as its elements. It is also
recalled (Birnbaum, 1968) that a sufficient statistic, $t^*(V) = \sum_{u_g \in V} a_g u_g$, exists in the logistic model
on the dichotomous response level where $z_g$ is replaced by the binary item score $u_g$. Although the
basic functions for $z_g = 0$ and $z_g = 1$ shown in (6.26) are identical with the corresponding functions
for $u_g = 0$ and $u_g = 1$ on the dichotomous response level with the replacement of $P_g(\theta)$ by $P^*_{z_g}(\theta)$, a
simple sufficient statistic does not exist, even though $t^*(V)$ can be used for any response pattern which
solely consists of 0 and 1. In general, the maximum likelihood estimation of the individual parameter
must be conducted numerically through the basic functions for each response pattern.
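
The numerical process can be sketched as a root search on the sum of the basic functions, which is strictly decreasing in $\theta$. The example below is an illustration under stated assumptions rather than the procedure used in the report: it takes the normal ogive basic function (6.25), a linear difficulty parameter function with $b_0 = -2.0$ and $b_1 = 2.0$ common to all items, hypothetical discrimination parameters, and plain bisection as the root finder.

```python
import math

b0, b1 = -2.0, 2.0  # assumed common to all items

def phi(t):
    return math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)

def upper_tail(t):
    """1 - Phi(t), computed stably through erfc."""
    return 0.5 * math.erfc(t / math.sqrt(2.0))

def basic_function(theta, a_g, z):
    """Normal ogive basic function (6.25) with b_z = b0 + (b1 - b0) z."""
    if z == 0.0:
        t = a_g * (theta - b0)
        return -a_g * phi(t) / upper_tail(t)   # denominator is Q*_0(theta)
    if z == 1.0:
        t = a_g * (theta - b1)
        return a_g * phi(t) / upper_tail(-t)   # denominator is P*_1(theta)
    return -a_g ** 2 * (theta - (b0 + (b1 - b0) * z))

def score(theta, a, pattern):
    """Sum of the basic functions over the response pattern."""
    return sum(basic_function(theta, a_g, z) for a_g, z in zip(a, pattern))

def mle(a, pattern, lo=-10.0, hi=10.0):
    """Bisection on the strictly decreasing sum of basic functions."""
    if score(lo, a, pattern) <= 0.0 or score(hi, a, pattern) >= 0.0:
        raise ValueError("no finite maximum inside the search interval")
    while hi - lo > 1e-12:
        mid = 0.5 * (lo + hi)
        if score(mid, a, pattern) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

a = [1.0, 1.5, 0.8]          # hypothetical discrimination parameters
interior = [0.3, 0.6, 0.5]   # no endpoint responses
theta_interior = mle(a, interior)
closed_form = (sum(ag ** 2 * (b0 + (b1 - b0) * z) for ag, z in zip(a, interior))
               / sum(ag ** 2 for ag in a))
mixed = [0.0, 0.6, 1.0]      # includes both endpoints
theta_mixed = mle(a, mixed)
```

For the pattern without endpoint responses the numerical estimate agrees with the weighted mean of the $b_{z_g}$ with weights $a_g^2$, which follows from the linear middle line of (6.25); once an endpoint response is present no such closed form is available and the search is needed. A pattern consisting solely of zeros, or solely of ones, has no finite maximum, which the sketch reports as an error.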

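For the item response information function, $I_{z_g}(\theta)$, we can write

(6.27)  $I_{z_g}(\theta) = -\frac{\partial^2}{\partial\theta^2} \log P_{z_g}(\theta) = -\frac{\partial}{\partial\theta} A_{z_g}(\theta)$

$= \begin{cases} a_g [(\frac{\partial}{\partial\theta} \Psi_g\{a_g(\theta - b_0)\}) Q^*_0(\theta) + a_g (\Psi_g\{a_g(\theta - b_0)\})^2] [Q^*_0(\theta)]^{-2} & z_g = 0 \\ [-(\frac{\partial^2}{\partial\theta^2} \Psi_g\{a_g(\theta - b_{z_g})\}) \Psi_g\{a_g(\theta - b_{z_g})\} + (\frac{\partial}{\partial\theta} \Psi_g\{a_g(\theta - b_{z_g})\})^2] [\Psi_g\{a_g(\theta - b_{z_g})\}]^{-2} & 0 < z_g < 1 \\ -a_g [(\frac{\partial}{\partial\theta} \Psi_g\{a_g(\theta - b_1)\}) P^*_1(\theta) - a_g (\Psi_g\{a_g(\theta - b_1)\})^2] [P^*_1(\theta)]^{-2} & z_g = 1 . \end{cases}$

In the normal ogive model, this takes the form

(6.28)  $I_{z_g}(\theta) = \begin{cases} a_g^2 \Psi_g\{a_g(\theta - b_0)\} [-a_g(\theta - b_0) Q^*_0(\theta) + \Psi_g\{a_g(\theta - b_0)\}] [Q^*_0(\theta)]^{-2} & z_g = 0 \\ a_g^2 & 0 < z_g < 1 \\ a_g^2 \Psi_g\{a_g(\theta - b_1)\} [a_g(\theta - b_1) P^*_1(\theta) + \Psi_g\{a_g(\theta - b_1)\}] [P^*_1(\theta)]^{-2} & z_g = 1 , \end{cases}$

and in the logistic model, we obtain

(6.29)  $I_{z_g}(\theta) = \begin{cases} D^2 a_g^2 P^*_0(\theta) Q^*_0(\theta) & z_g = 0 \\ 2 D^2 a_g^2 P^*_{z_g}(\theta) Q^*_{z_g}(\theta) & 0 < z_g < 1 \\ D^2 a_g^2 P^*_1(\theta) Q^*_1(\theta) & z_g = 1 . \end{cases}$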
In each of the two models this item response information function is positive throughout the entire
range of $\theta$, and the unique maximum condition is satisfied. Figure 6-6 presents the item response
information function in the normal ogive model in the upper graph and the one in the logistic model in
the lower graph with the same set of parameter values and the scaling factor that we used in Figure 6-5
for $z_g$ = 0.0, 0.1, 0.3, 0.5, 0.7, 0.9, 1.0, when the functional relationship between $z_g$ and $b_{z_g}$ is
linear. We can see from (6.28) that in the normal ogive model the horizontal line in the upper graph
of Figure 6-6 indicates the item response information function for each and every value of $z_g$ in the
interval (0,1), so this includes the five cases where $z_g$ = 0.1, 0.3, 0.5, 0.7, 0.9. Those in the two
limiting situations where $z_g$ tends to zero and unity, respectively, are also drawn by dashed lines.

The item information function, $I_g(\theta)$, is defined as the conditional mean of the item response
information function, given $\theta$ (Samejima, 1969, 1972, 1973), for which we can write

(6.30)  $I_g(\theta) = I_0(\theta)[1 - P^*_0(\theta)] + \int_0^1 I_{z_g}(\theta) H_{z_g}(\theta) \, dz_g + I_1(\theta) P^*_1(\theta)$ ,

where $I_0(\theta)$ and $I_1(\theta)$ indicate the item response information function $I_{z_g}(\theta)$ for $z_g = 0$ and
$z_g = 1$, respectively. This function is drawn by a dotted line in each graph of Figure 6-6.
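
For the normal ogive model the integral in (6.30) has a simple value, because the item response information is the constant $a_g^2$ for every $0 < z_g < 1$ and the continuous part carries the probability mass $P^*_0(\theta) - P^*_1(\theta)$. The sketch below evaluates (6.30) on that basis, using the endpoint lines of (6.28) and the parameter values quoted for Figure 6-6; the function names are hypothetical.

```python
import math

a_g, b0, b1 = 1.0, -2.0, 2.0  # parameter values quoted for Figure 6-6

def phi(t):
    return math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)

def Phi(t):
    return 0.5 * math.erfc(-t / math.sqrt(2.0))

def info_at_zero(theta):
    """First line of (6.28): response information for z_g = 0."""
    t = a_g * (theta - b0)
    q = Phi(-t)  # Q*_0(theta)
    return a_g ** 2 * phi(t) * (-t * q + phi(t)) / q ** 2

def info_at_one(theta):
    """Third line of (6.28): response information for z_g = 1."""
    t = a_g * (theta - b1)
    p = Phi(t)   # P*_1(theta)
    return a_g ** 2 * phi(t) * (t * p + phi(t)) / p ** 2

def item_information(theta):
    """(6.30); the integral reduces to a_g^2 times the continuous mass."""
    p_star_0 = Phi(a_g * (theta - b0))
    p_star_1 = Phi(a_g * (theta - b1))
    interior = a_g ** 2 * (p_star_0 - p_star_1)
    return (info_at_zero(theta) * (1.0 - p_star_0)
            + interior
            + info_at_one(theta) * p_star_1)

info_values = {theta: item_information(theta) for theta in (-3.0, -0.7, 0.0, 0.7, 3.0)}
```

Under these assumptions the item information stays below $a_g^2$ and is symmetric about $\theta = 0$, since $b_0 = -b_1$.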

Figure 6-7 illustrates the operating density characteristics $H_{z_g}(\theta)$ in the normal ogive and logistic
models in the closed/open response situation, with the item parameters $a_g = 1.0$ and $b_0 = -2.0$, and
the scaling factor $D = 1.7$ in the latter model, for the five selected item scores, 0.1, 0.3, 0.5, 0.7, and
0.9. The difficulty parameter function adopted here is shown in Figure 6-3 as the solid curve marked
with $k = 1$. As was observed in the closed response situation, this operating density characteristic
is proportional to $\Psi_g(\cdot)$ with $a_g^{-1}$ as the dispersion parameter and $b_{z_g}$ as the location parameter
with $a_g (\frac{\partial}{\partial z_g} b_{z_g})$ as the ratio of proportionality. Since in this example the derivative of the difficulty
parameter function is given by $(\pi/2) \sec^2[(\pi/2) z_g]$ and it increases with $z_g$, the area under the
curve of $H_{z_g}(\theta)$ in Figure 6-7 increases as $z_g$ does, both in the normal ogive and logistic models.
In fact, the area approaches infinity as $z_g$ tends to unity and, therefore, $b_{z_g}$ tends to infinity, a
tendency that is hinted at by the truncated curves for $H_{z_g}(\theta)$ for $z_g = 0.9$ in the two graphs of Figure
6-7. On the other hand, when the continuous item score $z_g$ tends to zero and, therefore, $b_{z_g}$ tends
to $b_0$, the ratio of proportionality approaches $(\pi/2) a_g$, and this limiting case of $H_{z_g}(\theta)$ is shown by
a dashed curve in Figure 6-7 in each of the normal ogive and the logistic models. The areas under the
curves for the same value of $z_g$ across the two graphs of Figure 6-7 are equal.
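
The statement about the areas can be verified numerically. The sketch below assumes the tangent relationship (6.14) with $\xi(z_g) = z_g$ and the parameter values quoted for Figure 6-7, and it integrates $H_{z_g}(\theta)$ over $\theta$ by Simpson's rule; the function names, the integration window, and the symmetric rewriting of the logistic density are choices made here, not part of the report.

```python
import math

D = 1.7
a_g, b0 = 1.0, -2.0  # parameter values quoted for Figure 6-7

def psi_normal(t):
    return math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)

def psi_logistic(t):
    """(6.4), written with |t| so that exp() never overflows."""
    e = math.exp(-D * abs(t))
    return D * e / (1.0 + e) ** 2

def b_of_z(z):
    return b0 + math.tan(0.5 * math.pi * z)

def db_dz(z):
    """(pi/2) sec^2[(pi/2) z], the derivative of the difficulty parameter function."""
    return 0.5 * math.pi / math.cos(0.5 * math.pi * z) ** 2

def area_under_H(psi, z, half_width=15.0, n=6000):
    """Simpson's rule for the integral of H over theta, centred at b_z."""
    centre, ratio = b_of_z(z), a_g * db_dz(z)
    lo = centre - half_width
    h = 2.0 * half_width / n
    total = 0.0
    for i in range(n + 1):
        w = 1 if i in (0, n) else (4 if i % 2 else 2)
        total += w * ratio * psi(a_g * (lo + i * h - centre))
    return total * h / 3.0

areas = {z: (area_under_H(psi_normal, z), area_under_H(psi_logistic, z))
         for z in (0.1, 0.3, 0.5, 0.7, 0.9)}
```

In this sketch each area equals the derivative of the difficulty parameter function at the given $z_g$, so it is the same in the two models and grows without bound as $z_g$ approaches unity.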

Figure 6-8 presents the item response information function $I_{z_g}(\theta)$ by solid lines and the item
information function $I_g(\theta)$ by a dotted line in each of the normal ogive and the logistic models, with
the same parameters, scaling factor, difficulty parameter function and fixed values of $z_g$ as were used
in Figure 6-7, together with the limiting case of $I_{z_g}(\theta)$ where $z_g$ tends to zero, which is drawn by a
dashed line.

Similar observations were also made for the open/closed response situation, which will not be
presented here because of the shortage of space.

[VI. 5]  Prospect  of  Adopting  These  Models  for  Rorschach  Diagnosis 

This phase of advancement of latent trait theory dealing with partially continuous and partially
discrete responses has further enhanced the opportunity of applying the theory to cognitive processes.
For one thing, the model for the closed/open response situation is readily applicable to the response
latency, which must have a discrete category of "too much delay" or "too slow a response." There are
many more conceivable applications of these models, even for kinds of cognitive processes that have
never been analyzed by psychometric methods.

FIGURE 6-6

Item Response Information Functions, $I_{z_g}(\theta)$, (Solid Lines) and Item Information Function, $I_g(\theta)$,
(Dotted Line) in the Normal Ogive and the Logistic Models, with $a_g = 1.0$, $b_0 = -2.0$, $b_1 = 2.0$
and $D = 1.7$. In the Normal Ogive Model, the Horizontal Line Indicates Common $I_{z_g}(\theta)$ for All Item
Scores, $0 < z_g < 1$, While in the Logistic Model the Five Curves Identical in Shape Indicate $I_{z_g}(\theta)$
for $z_g$ = 0.1, 0.3, 0.5, 0.7, 0.9, When the Functional Relationship between $z_g$ and $b_{z_g}$ Is Linear, with
the Two Dashed Curves as Those in the Limiting Situations When $z_g$ Tends to Zero and Unity,
Respectively. Closed Response Situation.


As one such ambitious research project, in 1985 the principal investigator started exploring the
possibility of analyzing clinicians’ diagnoses based upon the Rorschach Test. At first she discussed
the idea with one of her colleagues, Dr. Alvin Burstein, the director of the Clinical Psychology Program
of the University of Tennessee. Then, starting in April 1986 and joined by a young Ph.D., Scott Glass,
the three researchers have met regularly and discussed the new research prospect.

During this period it was decided that we would pursue the diagnosis of the intellectual aspect of
patients as a starting point. Although diagnosis through the Rorschach Test is basically for pathological
aspects, it is common for clinicians to consider the intellectual aspect of each patient when they decide
upon the therapy, regardless of the specific pathological problem the patient has. In so doing, clinicians
tend to put more importance upon their own diagnosis through the Rorschach Test than upon the
information given by so-called intelligence tests such as WAIS, WISC, etc. Thus the diagnosis of the
intellectual aspect through the Rorschach Test may be more useful and suitable for us to pursue first,
before going into specifics such as schizophrenics, neurotics, etc. Dr. Glass took the initiative in the
preliminary study, making various frequency distributions based upon 243 subjects, and also upon six
subjects randomly selected from the more intellectual subgroup, which consists of 68 undergraduate
students in the College Scholar Program of the University of Tennessee, and upon six subjects from the
less intellectual subgroup, i.e., 42 foster care children of twelve to eighteen years of age. Approximately
eighty percent of the subjects of this second subgroup have IQ scores of 70 to 80 as measured by the
Peabody Picture Vocabulary Test.

In selecting items, the main task of the principal investigator has been to listen to the two clinicians,
who were asked to analyze their own ways of diagnosing the intellectual aspect of patients through the
Rorschach Test, and, with theoretical considerations in mind, to decide the most appropriate way of
scoring each item. This has been done repeatedly over the years. Special care has been taken to avoid
using overlapping information in defining items and their separate scoring strategies, while making
every effort to preserve and simulate the actual diagnosis. Recently, Dr. Sandra Loucks also joined
our group; she critically reviewed our tentative conclusions on item selection and scoring strategies and
suggested additions and changes. Dr. Allen Rosenwald also joined our discussion on one occasion at
our invitation.

Now  we  have  reached  the  stage  that  we  feel  comfortable  with  our  selection  of  items  and  their 
separate  scoring  strategies,  from  both  the  clinicians’  and  the  psychometrician’s  standpoint.  Appendix 
B  presents  these  results.  For  each  item,  the  model  which  is  considered  most  suitable  is  written,  in 
addition  to  the  content  and  the  scoring  strategy  of  the  item.  As  we  can  see  in  this  table,  these  models 
are  the  open/closed  response  model,  the  closed  response  model  and  the  graded  response  model. 

This  separate  research  project  is  still  in  progress  and  it  will  take  a  long  time  before  we  get  the 
results,  for  we  are  still  in  the  process  of  obtaining  more  Rorschach  data  and  of  rescoring  each  protocol 
following  our  definition  of  items  and  their  separate  scoring  strategies.  The  prospect  of  the  success  in 
this  research  project  seems  to  be  good,  however,  due  to  the  new  family  of  latent  trait  models  proposed 
in  the  present  research  and  summarized  in  this  chapter. 

[VI. 6]  Discussion 

This proposal of a new set of latent trait models may be one of the biggest accomplishments during
this research period. One objective of the proposed research was to bridge psychometrics with cognitive
psychology. The principal investigator hopes that in the future these models will be used in different
areas of cognitive psychology as well as in those areas where traditionally psychometric theory and
methods have been used more frequently.

FIGURE 6-7

Operating Density Characteristic, $H_{z_g}(\theta)$, As a Function of $\theta$ for Each of the Five Values of the Item
Score, 0.1, 0.3, 0.5, 0.7, and 0.9, Following the Normal Ogive and the Logistic Models, with $a_g = 1.0$,
$b_0 = -2.0$ and $D = 1.7$, When the Functional Relationship between the Item Score $z_g$ and the
Difficulty Parameter $b_{z_g}$ Is Given by $b_{z_g} = b_0 + \tan[(\pi/2) z_g]$. The Additional Curve Is the One in the
Limiting Situation Where $z_g$ Tends to Zero. Closed/Open Response Situation.

[Graphs of Figure 6-8: two panels, Normal Ogive Model and Logistic Model, each with horizontal axis LATENT TRAIT $\theta$ from -4.0 to 4.0.]

FIGURE 6-8

Item Response Information Functions, $I_{z_g}(\theta)$, (Solid Lines) and Item Information Function, $I_g(\theta)$,
(Dotted Line) in the Normal Ogive and the Logistic Models, with $a_g = 1.0$, $b_0 = -2.0$ and
$D = 1.7$. In the Normal Ogive Model, the Horizontal Line Indicates Common $I_{z_g}(\theta)$ for All Item
Scores, $0 < z_g < 1$, While in the Logistic Model the Five Curves Identical in Shape Indicate $I_{z_g}(\theta)$
for $z_g$ = 0.1, 0.3, 0.5, 0.7, 0.9, When the Functional Relationship between $z_g$ and $b_{z_g}$ Is Given by
$b_{z_g} = b_0 + \tan[(\pi/2) z_g]$, with the Dashed Curve as the One in the Limiting Situation Where $z_g$ Tends
to Zero. Closed/Open Response Situation.

There  are  many  more  graphs  which  clarify  the  shapes  of  various  functions  developed  in  this  part  of 
research.  The  reader  is  directed  to  [1.1.5]  for  them. 
VII  Informative Distractors and Their Plausibility Functions
in the Multiple-Choice Test Items

The  multiple-choice  format  has  been  most  widely  used  in  the  paper-and-pencil  testing  situation,  in 
which  the  examinee  is  to  choose  one  of  the  several  alternative  answers  that  are  prearranged  for  each 
question  as  his  answer.  This  tendency  has  also  been  carried  to  the  computerized  adaptive  testing,  in 
which  the  examinee  is  to  answer  a  sequence  of  questions  selected  out  of  an  item  pool  and  presented  by 
a  computer. 

In dealing with the multiple-choice test item, the three-parameter logistic model (Birnbaum, 1968) has
been popular among researchers. The model is based upon the knowledge or random guessing principle,
which assumes that the examinee either knows the answer or guesses randomly. Thus each item is scored
either right or wrong, depending upon whether the examinee has chosen the correct answer or one of the
incorrect alternative answers. All incorrect alternative answers are therefore treated as equivalent to
each other, and it is implicitly assumed that those distractors have identical operating characteristics.
Such models belong to the family called the Equivalent Distractor Model.
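
As an illustration of the Equivalent Distractor Model, the sketch below evaluates the usual form of the three-parameter logistic item characteristic function and splits the remaining probability evenly over the distractors. The even split is one simple reading of "identical operating characteristics", and the function names and parameter values are hypothetical; none of them come from the Iowa data.

```python
import math

def three_pl(theta, a, b, c, D=1.7):
    """Probability of the correct answer under the three-parameter logistic model."""
    return c + (1.0 - c) / (1.0 + math.exp(-D * a * (theta - b)))

def equivalent_distractor(theta, a, b, c, n_distractors, D=1.7):
    """Probability of each distractor when all distractors are treated alike
    (assumed here to mean an even split of the incorrect-answer probability)."""
    return (1.0 - three_pl(theta, a, b, c, D)) / n_distractors

# Hypothetical four-alternative item: a = 1.0, b = 0.0, guessing level c = 0.25.
for theta in (-3.0, -1.0, 0.0, 1.0, 3.0):
    print(theta, three_pl(theta, 1.0, 0.0, 0.25),
          equivalent_distractor(theta, 1.0, 0.0, 0.25, 3))
```

In this formulation the choice of a particular wrong answer tells us nothing beyond the fact that the answer was wrong, since every distractor follows the same curve.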

We notice that the Equivalent Distractor Model does not account for any information provided by the
choices of specific wrong alternative answers. Thus it treats the multiple-choice test item as nothing but
a blurred image of the free-response test item. One must question, however, whether the knowledge or
random guessing principle indeed fits the examinee’s behavior in testing situations. The answer seems
to be "No" in most cases: examinees do make intentional choices of wrong alternative answers,
or distractors. Because of this fact, we need some other models that do not belong to the Equivalent
Distractor Model.

The  principal  investigator  has  proposed  a  new  family  of  models  for  the  multiple-choice  test  item 
(Samejima,  1979),  which  accounts  for  such  intentional  choices  of  distractors  by  examinees.  In  such 
models,  each  incorrect  alternative  answer,  as  well  as  the  correct  answer,  provides  us  with  its  unique 
information,  although  examinees  may  still  guess  randomly  when  they  are  desperate  and  have  no  idea 
as  to  which  alternative  is  more  plausible  than  the  others  as  the  answer  to  the  question.  Such  a  family 
of  models  belongs  to  the  Informative  Distractor  Model. 

The plausibility function of each distractor is defined as the conditional probability assigned to the
choice of the particular distractor, given ability. If the plausibility function of one or more distractors
is informative, then we will be able to make use of that information in ability estimation, in addition
to the information provided by the correct answer. Thus the multiple-choice test item is no longer a
"blurred image" of the free-response test item, but has a unique status as a test item which can be more
informative than the free-response test item. The principal investigator’s family of models includes
these plausibility functions for incorrect alternative answers.

To begin with, it will be worthwhile to estimate the plausibility functions of the distractors of existing
test items, in order to find out if, indeed, some distractors provide us with their unique information.
Since we know very little about the behavior of wrong alternative answers of the multiple-choice test
item, at this stage it is more desirable to approach their plausibility functions without assuming any
mathematical form. Thus, theory and methods for estimating the operating characteristics of discrete
item responses, which were summarized in Chapter II, found their full usefulness in this part of research.

In  this  chapter,  a  brief  outline  of  the  research  will  be  described.  For  more  details  and  information, 
see  [1.1.6]. 


[VII.1]  Iowa Test Data

Iowa  Test  Data  are  based  upon  the  Iowa  Tests  of  Basic  Skills,  Form  6,  Levels  9-14  (Hieronymus  and 
Lindquist,  1971).  These  tests  have  been  designed,  constructed  and  revised  at  the  College  of  Education 
of  the  University  of  Iowa  since  1935,  with  the  general  school  population  in  mind,  and  basically  for 
the  fourth  through  ninth  graders.  There  are  eleven  tests  in  the  battery,  each  of  which  focuses  upon  a 
different  basic  skill.  The  numbers  of  test  items  in  the  eleven  separate  tests  vary  within  the  range  of  74 
through  178,  including  all  the  six  levels. 

Our  data  were  obtained  by  the  courtesy  of  Professor  William  Coffman  of  the  University  of  Iowa. 
They  were  collected  in  three  different  school  systems  in  the  State  of  Iowa,  in  the  years  1971  through 
1977,  using  the  subtests  of  Levels  11,  12  and  13  (cf.  Samejima  and  Trestman,  1980). 

In  the  present  study,  the  results  of  the  2,364  examinees  on  the  Level  11  Vocabulary  Subtest  were  most 
intensively  analyzed.  This  subtest  consists  of  forty-three  test  items,  each  of  which  has  four  alternative 
answers,  i.e.,  one  correct  answer  plus  three  distractors. 

[VII.  2]  Method 

We  mainly  adopted  the  Simple  Sum  Procedure  of  the  Conditional  P.D.F.  Approach  combined  with 
the  Normal  Approach  Method  (cf.  Chapter  II)  for  estimating  the  plausibility  functions.  In  so  doing  we 
needed  some  suitable  substitute  for  the  Old  Test,  since  there  is  no  other  set  of  vocabulary  items  whose 
characteristics  are  already  known.  In  order  to  handle  this  situation,  we  used  the  Level  11  Vocabulary 
Subtest  itself  twice,  i.e.,  first  as  the  Old  Test  and  later  as  the  set  of  "unknown”  test  items.  Thus  on 
the  first  stage,  each  item  was  scored  either  "right”  or  "wrong”,  and  the  normal  ogive  model  on  the 
dichotomous  response  level  was  assumed.  We  accepted  this  model  tentatively,  and  item  parameter 
estimation  was  performed  for  each  of  the  forty-three  test  items.  On  the  second  stage,  these  forty-three 
test  items  were  treated  as  "unknown”  multiple-choice  test  items  with  polychotomous  item  responses, 
and  for  each  item  we  obtained  an  estimated  item  characteristic  function  for  the  correct  answer  and  an 
estimated  plausibility  function  for  each  of  the  three  distractors.  The  former  was  then  compared  with 
the  hypothesized  normal  ogive  function  as  a  part  of  the  model  validation  process.  If  the  normal  ogive 
model  was  validated,  then  we  would  accept  the  estimated  plausibility  functions  of  the  distractors.  If 
not,  we  would  examine  the  invalidated  test  items,  and  either  assume  more  suitable  models  for  them  or 
discard  these  items,  to  produce  a  new  Old  Test  and  would  repeat  the  estimation  process  all  over  again. 

It  was  assumed  that  the  response  tendencies  of  our  2,364  examinees  behind  the  forty-three  test 
items  had  a  multinormal  distribution  as  their  joint  distribution.  If  there  existed  a  single  dominating 
common  factor  behind  these  forty-three  response  tendencies,  then  it  would  be  defined  operationally  as 
the  vocabulary  ability  in  question.  Consequently,  the  ability  distribution  for  these  2,364  subjects  would 
also  be  normal,  and  the  origin  and  the  unit  of  the  scale  would  be  defined  at  its  mean  and  standard 
deviation,  respectively. 

The tetrachoric correlation coefficient was obtained for each pair of test items, using the program
written by the principal investigator. The resulting inter-item correlation matrix was factor analyzed,
using the computer program for principal factor solution in Biomedical Computer Programs Multivariate
Analysis Series 4 (BMDP4M). The communalities were estimated iteratively, with the squared multiple
correlation of each variable with all other variables as its initial estimate. If we found a relatively
powerful second factor, etc., in addition to the dominating first factor, however, we would eliminate
some appropriate items from the Old Test to resolve the clusters, and factor analyze the reduced
correlation matrix again, until we reached a single general factor pattern.

The estimated item discrimination parameter, $a_g$, and item difficulty parameter, $b_g$, were given
by

(7.1)  $a_g = \rho_g (1 - \rho_g^2)^{-1/2}$

and

(7.2)  $b_g = \gamma_g \rho_g^{-1}$ ,

where $\rho_g$ is the factor loading of item $g$ on the first common factor, and $\gamma_g$ is the normal deviate
corresponding to the proportion correct $p_g$ of each item $g$.
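
A short sketch of (7.1) and (7.2) follows. It assumes that the normal deviate is the point of the standard normal distribution above which the area equals the proportion correct, a sign convention inferred from the tabulated values (negative deviates for items with proportion correct above 0.5); the function names are illustrative only.

```python
import math
from statistics import NormalDist

def normal_deviate(p_correct):
    """Normal deviate gamma_g: the area above it equals the proportion correct
    (sign convention inferred from the tabulated values, not stated in the text)."""
    return NormalDist().inv_cdf(1.0 - p_correct)

def item_parameters(loading, p_correct):
    """Discrimination and difficulty from the first-factor loading, (7.1) and (7.2)."""
    a = loading / math.sqrt(1.0 - loading ** 2)
    b = normal_deviate(p_correct) / loading
    return a, b

# Item 30: factor loading 0.691 (from the text) and proportion correct 0.64975.
a30, b30 = item_parameters(0.691, 0.64975)
print(round(a30, 3), round(b30, 3))
```

For item 30 this gives values close to the tabulated estimates, 0.956 and -0.557; small differences in the last digit can arise because the loading is quoted to three decimals only.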

[VII. 3]  Results 

The  same  procedure  leading  to  factor  analysis  was  applied  for  each  of  the  other  ten  Level  11  Iowa 
Subtests,  and  the  resulting  sets  of  eigenvalues  are  shown  in  Table  7-1,  except  for  those  of  the  Level  11 
Reading  Comprehension  Subtest  (R).  We  can  see  that  for  the  Vocabulary  Subtest  the  set  of  eigenvalues 
indicates  a  single  common  factor  structure,  although  there  exist  relatively  powerful  second  common 
factors  for  several  other  subtests.  This  may  be  due  to  the  fact  that  reading  ability  is  always  required 
in  any  subtest  in  addition  to  its  core  performance,  while  in  Vocabulary  Subtests  those  two  abilities  are 
close  in  nature. 

As expected, it turned out that the factor loadings on the first common factor were all positive, and,
except for those of items 24 and 44, they were greater than 0.300, ranging from 0.316 for item 39 to
0.691 for item 30. The largest cluster of factor loadings we can find in the common factors other than
the first one is the pair in the fourth factor, i.e., 0.393 for item 33 and 0.368 for item 44. Most
of the factor loadings on those other common factors are less than 0.300 in absolute value. From this
result, the decision was made to define the first common factor operationally as the vocabulary ability
and to use the whole set of items in the Subtest as the Old Test. The estimated item parameters $a_g$
and $b_g$ are shown in Table 7-2, together with the proportion correct $p_g$ and the normal deviate $\gamma_g$,
for each of the forty-three items.

Figure 7-1 presents the square root of the test information function of the Old Test by a solid line,
and also its approximation, i.e., the polynomial of degree 7 obtained by the method of moments using
the interval of $\theta$, (-5.0, 5.0), by a dotted line. The actual formula of the latter is given by

(7.3)  $[I(\theta)]^{1/2} = 3.1915950 - 0.23604972\,\theta - 0.27322550\,\theta^2$
       $\qquad + 0.026248259\,\theta^3 + 0.012315578\,\theta^4 - 0.0011485951\,\theta^5$
       $\qquad - 0.00022787645\,\theta^6 + 0.000018322697\,\theta^7$ .

The method of moments was applied for four different intervals of $\theta$, i.e., (-4.0, 4.0), (-4.5, 4.5),
(-5.0, 5.0) and (-5.5, 5.5), and the result shown in Figure 7-1 provided us with the best fit. The
polynomial for transforming $\theta$ to $\tau$ was obtained from this result, and it turned out to be a polynomial
of degree 8 such that

(7.4)  $\tau(\theta) = 0.00000000 + 0.79789874\,\theta - 0.029506215\,\theta^2$
       $\qquad - 0.022768792\,\theta^3 + 0.0016405162\,\theta^4 + 0.00061577891\,\theta^5$
       $\qquad - 0.000047858127\,\theta^6 - 0.0000081384446\,\theta^7 + 0.00000057258428\,\theta^8$ .


TABLE 7-1

Eigenvalues of the Matrix (R-V) for Each of the Ten Level 11 Subtests Obtained As the Results of the
Principal Factor Solution of Factor Analysis.

0.0437, -0.0808, 0.0362, -0.0982, 0.0184, -0.1098, 0.0151, -0.1244, 0.0083, -0.1458,
-0.0138, -0.1598, -0.0221, -0.1691, -0.0294, -0.1934, -0.0397, -0.2323,
-0.0425, -0.0492, -0.0650, -0.0788, -0.0884, -0.0954, -0.1259, -0.1422,
-0.1520, -0.1637, -0.1676, -0.1872, -0.2052, -0.2239


TABLE 7-2

Estimated Item Discrimination Parameter $a_g$ and Item Difficulty
Parameter $b_g$, Proportion Correct $p_g$ and Normal Deviate $\gamma_g$,
for Each of the Forty-Three Old Test Items of the Iowa Level 11
Vocabulary Subtest.

  Item    Discrimination    Difficulty     Proportion    Normal
  $g$     Parameter $a_g$   Parameter $b_g$  Correct $p_g$  Deviate $\gamma_g$

   24        0.196           -4.257         0.79315      -0.81740
   25        0.829           -1.000         0.73816      -0.6376?
   26        0.614           -0.821         0.66624      -0.4295?
   27        0.594           -0.340         0.56895      -0.17370
   28        0.669           -0.900         0.69162      -0.50045
   29        0.867           -1.077         0.75973      -0.70543
   30        0.956           -0.557         0.64975      -0.38465
   31        0.938           -0.179         0.54865      -0.12225
   32        0.940           -0.803         0.70897      -0.55038
   33        0.434           -2.331         0.82318      -0.92755
   34        0.598           -1.210         0.73266      -0.62088
   35        0.489           -0.569         0.59856      -0.24962
   36        0.657           -0.987         0.70601      -0.54177
   37        0.351            0.577         0.42428       0.19096
   38        0.665           -0.468         0.60237      -0.25949
   39        0.333           -0.676         0.58460      -0.21368
   40        0.683            0.402         0.41032       0.22672
   41        0.531           -0.948         0.67174      -0.44472
   42        0.436            0.258         0.45897       0.10303
   43        0.672           -0.867         0.68570      -0.48370
   44        0.143            4.175         0.27665       0.59282
   45        0.898           -0.357         0.59433      -0.23870
   46        0.612           -0.318         0.56599      -0.16617
   47        0.494           -0.781         0.63536      -0.34608
   48        0.849            0.054         0.48604       0.03500
   49        0.421           -0.626         0.59602      -0.24306
   50        0.346           -0.250         0.53257      -0.08173
   51        0.664           -0.420         0.59179      -0.23215
   52        0.640            0.217         0.45347       0.11690
   53        0.402            0.526         0.42217       0.19635
   54        0.573            0.126         0.47504       0.06261
   55        0.667           -0.342         0.57530      -0.18988
   56        0.593            1.007         0.30372       0.51373
   57        0.370            0.398         0.44501       0.13828
   58        0.416            0.782         0.38198       0.30028
   59        0.491           -0.731         0.62648      -0.32254
   60        0.678           -0.170         0.53807      -0.09557
   61        0.519            0.748         0.36506       0.34497
   62        0.938           -0.485         0.62986      -0.33148
   63        0.637           -0.398         0.58460      -0.21368
   64        0.818           -0.042         0.51058      -0.02652
   65        0.606            0.595         0.37902       0.30806
   66        0.604           -0.376         0.57699      -0.19420


FIGURE 7-1

Square Root of Test Information Function $[I(\theta)]^{1/2}$ of the Level 11 Vocabulary Subtest (Solid Line)
and Its Approximation by the Polynomial of Degree 7 Obtained by the Method of Moments with the
Specified Interval of $\theta$, [-5.0, 5.0] (Dotted Line).


FIGURE 7-2

Square Root of Test Information Function $[I^*(\tau)]^{1/2}$ of the Level 11 Vocabulary Subtest Obtained
from the Polynomial Transformation of $\theta$ to $\tau$ (Dotted Line) and Its Target (Solid Line).

Figure 7-2 presents the square root of the test information function of $\tau$ thus obtained by using the
approximated polynomial for $[I(\theta)]^{1/2}$ which was given by (7.3) and the derivative of $\tau$ obtainable
from (7.4). Since the interval of $\theta$, (-4.0, 4.0), corresponds to the interval of $\tau$, (-2.44244, 2.02098),
the latter is shown by arrows in Figure 7-4. We can see that for this interval of $\tau$ the approximated
square root of the test information function, $[I^*(\tau)]^{1/2}$, is practically constant.
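
The relation between (7.3) and (7.4) can be examined with a few lines of arithmetic. The sketch below uses the coefficients as read from the two formulas and assumes that the transformation is a constant multiple of the term-by-term integral of the degree 7 polynomial, with zero constant term; the multiplier is recovered from the leading coefficients rather than taken from the text, and all names are illustrative.

```python
# Coefficients of (7.3), lowest degree first: square root of test information in theta.
SQRT_INFO = [3.1915950, -0.23604972, -0.27322550, 0.026248259,
             0.012315578, -0.0011485951, -0.00022787645, 0.000018322697]
# Coefficients of (7.4), lowest degree first: tau as a polynomial in theta.
TAU = [0.0, 0.79789874, -0.029506215, -0.022768792, 0.0016405162,
       0.00061577891, -0.000047858127, -0.0000081384446, 0.00000057258428]

def polyval(coeffs, x):
    """Evaluate a polynomial by Horner's rule (coefficients lowest degree first)."""
    result = 0.0
    for c in reversed(coeffs):
        result = result * x + c
    return result

def integrate(coeffs, scale):
    """Term-by-term antiderivative with zero constant term, multiplied by scale."""
    return [0.0] + [scale * c / (k + 1) for k, c in enumerate(coeffs)]

def derivative(coeffs):
    """Term-by-term derivative (coefficients lowest degree first)."""
    return [k * c for k, c in enumerate(coeffs)][1:]

def sqrt_info_tau(theta):
    """Square root of the test information on the tau scale, evaluated at theta."""
    return polyval(SQRT_INFO, theta) / polyval(derivative(TAU), theta)

SCALE = TAU[1] / SQRT_INFO[0]        # assumed constant multiplier, about 0.25
REBUILT = integrate(SQRT_INFO, SCALE)

print(SCALE)
print(polyval(TAU, -4.0), polyval(TAU, 4.0))
print([sqrt_info_tau(t) for t in (-4.0, -2.0, 0.0, 2.0, 4.0)])
```

With these coefficients the rebuilt polynomial agrees with (7.4) to the printed precision, the end points of the interval (-4.0, 4.0) map to roughly -2.442 and 2.021, and the transformed square root of the test information stays at about 4 throughout, which is the constancy described above.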

The maximum likelihood estimate, $\hat{\theta}_s$, of $\theta$ was obtained for each individual subject $s$ from his
response pattern on the Old Test items, and was transformed to that of $\tau$ through (7.4). At this
stage, eight subjects whose $\hat{\theta}_s$ are outside of the interval (-3.75, 3.75) were excluded permanently
from the rest of the research, so that the number of subjects was reduced to 2,356.

The method of moments was applied for the set of 2,356 $\hat{\tau}_s$’s to produce the best fitted polynomials
of degrees 3 and 4 in the least squares sense (cf. Samejima and Livingston, 1979), and they turned
out to be

(7.5)  $g^*(\hat{\tau}) = 0.42358084 - 0.046813019\,\hat{\tau}$
       $\qquad - 0.13270786\,\hat{\tau}^2 + 0.020014202\,\hat{\tau}^3$

and

(7.6)  $g^*(\hat{\tau}) = 0.45023559 - 0.044232853\,\hat{\tau} - 0.20387563\,\hat{\tau}^2$
       $\qquad + 0.018406862\,\hat{\tau}^3 + 0.022176405\,\hat{\tau}^4$ ,

respectively.

Table 7-3 presents the frequency distribution of the 2,356 $\hat{\tau}_s$’s with respect to the types of the
conditional distribution of $\tau$, given $\hat{\tau}_s$, in both Degree 3 and 4 Cases. These types, 1 through 7,
indicate Pearson’s Types (Elderton and Johnson, 1969; Johnson and Kotz, 1970) which were assigned
by evaluating the values of the criterion $\kappa$. We can see in this table that in both Degree 3 and 4 Cases
more than sixty percent of the cases belong to the normal distribution, while most of the others belong
to the Beta distribution, i.e., either Pearson’s Type 1 or 2. There are some cases whose conditional
distributions of $\tau$ are undefined, either due to a negative value for an estimated even conditional
moment or to a negative value for the estimated conditional probability density. Those subjects were
excluded from the rest of the research.
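
For readers unfamiliar with the criterion, the sketch below classifies a distribution from its squared skewness $\beta_1$ and kurtosis $\beta_2$ using the standard form of Pearson’s criterion, $\kappa = \beta_1(\beta_2 + 3)^2 / [4(4\beta_2 - 3\beta_1)(2\beta_2 - 3\beta_1 - 6)]$. The tolerance, the function names and the way borderline cases are resolved are assumptions for illustration; the report does not state how these decisions were made in the actual computer programs.

```python
def pearson_kappa(beta1, beta2):
    """Pearson's criterion kappa from squared skewness beta1 and kurtosis beta2."""
    denom = 4.0 * (4.0 * beta2 - 3.0 * beta1) * (2.0 * beta2 - 3.0 * beta1 - 6.0)
    return beta1 * (beta2 + 3.0) ** 2 / denom

def pearson_type(beta1, beta2, tol=1e-9):
    """Assign a Pearson type; tol is a hypothetical tolerance for borderline cases."""
    if abs(beta1) <= tol:
        # Symmetric cases, where kappa equals zero or is indeterminate.
        if abs(beta2 - 3.0) <= tol:
            return "normal"
        return "Type 2" if beta2 < 3.0 else "Type 7"
    if abs(2.0 * beta2 - 3.0 * beta1 - 6.0) <= tol:
        return "Type 3"                 # kappa is infinite
    kappa = pearson_kappa(beta1, beta2)
    if kappa < 0.0:
        return "Type 1"
    if abs(kappa - 1.0) <= tol:
        return "Type 5"
    return "Type 4" if kappa < 1.0 else "Type 6"

# Hypothetical moment ratios, not values from Table 7-3.
for b1, b2 in ((0.0, 3.0), (0.0, 2.5), (0.1, 2.5), (0.5, 4.5), (1.0, 4.8)):
    print(b1, b2, pearson_type(b1, b2))
```

In practice the estimated moments never hit the boundary values exactly, so some rule for declaring a case "normal" is needed; the fixed tolerance above is only a stand-in for whatever rule was used.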

The above results support our choice of the Normal Approach Method in both Degree 3 and 4 Cases.
Moreover, a close examination of the skewness and kurtosis indices further discloses the fact that, in
most cases where the conditional distributions of $\tau$ belong to Pearson’s Type 1 or 2 distribution, they
are very close to 0.0 and 3.0, respectively, i.e., the numbers which characterize the normal distribution.


TABLE 7-3


Since these two sets of results are very similar to each other, from there we dealt solely with Degree
4 Case. It is worth noting, however, that the results of Degree 3 Case would be just as respectable as
those of Degree 4 Case, in spite of the fact that the degree of the polynomial approximating $g^*(\hat{\tau})$ is
one less and as small as 3.

Figure 7-3 exemplifies the resulting estimated item characteristic function and estimated plausibility
functions for each of four items, i.e., items 37, 43, 44 and 45. For most of the forty-three items,
the fit of the estimated item characteristic function to the initial normal ogive curve, which are
drawn by dotted and solid lines, respectively, is as good as that of item 43 or 45, although for some
items it is a little worse, as is illustrated in the first graph for item 37. The only exception is item
44, whose four estimated functions are also shown in Figure 7-3. With all these things considered, it
was decided to accept the first Old Test, and not to repeat the whole procedure using a modified Old
Test. The estimated plausibility functions for items 37 and 45 indicate the existence of some informative
distractors for these items, although some other distractors do not explicitly show their informativeness,
such as the one drawn by the shortest dashed line in each of the two graphs. For item 43 the three
distractors did not prove to be very informative. Note, however, that these distractors may be informative
on much lower levels of ability $\theta$.

The  model  validation  was  further  made  by  computing  the  chi-square  statistics  testing  the  bivariate 
normality  for  each  pair  of  response  tendencies.  The  results  turned  out  to  be  fairly  supportive. 

[VII. 4]  Discussion 

The  item  analysis  on  the  Iowa  Test  Data  turned  out  to  be  easier  and  more  successful  than  the 
principal  investigator  had  anticipated.  Most  of  the  test  items  are  not  likely  to  follow  the  Equivalent 
Distractor  Model,  to  which  the  three-parameter  logistic  or  normal  ogive  model  belongs.  We  have 
discovered  many  distractors  which  are  informative,  and  the  results  suggest  that  most  of  the  items 
follow  the  Informative  Distractor  Model.  Methodologies  involved  in  the  present  study  appear  to  be 
promising,  and  they  will  find  their  usefulness  in  many  other  future  studies. 

The next logical step will be to find out how we can make the best use of the information obtainable
from the distractors as well as from the correct answers, in order to increase the efficiency of ability
estimation. It is also necessary to collect data for subjects of lower levels of ability in order to find the
information provided by all distractors. There is a good prospect that the new family of models for the
multiple-choice test item will find its place. In this brief summary only a flavor of this part of research
was presented. There are many more interesting results for this set of data, and the reader is directed
to [1.1.6]. All the results on the other Level 11 Subtests and those on Levels 12 and 13 Subtests are
excluded from the present report because of the shortage of space.
VIII  Analysis of Shiba’s Data Collected upon His Word/Phrase
Comprehension Tests: Comparison of Tetrachoric Method
and Logist 5 on Empirical Data

As was pointed out earlier, the three-parameter logistic model (Birnbaum, 1968) has earned its
popularity in the past years as a model for the multiple-choice test item. This tendency was facilitated
by the availability of computer programs, which are represented by Logist 5 (Wingersky, Barton and
Lord, 1982), for estimating the three item parameters. Logist 5 can be used not only for the item
parameter estimation in the three-parameter logistic model, but also in the (two-parameter) logistic
model, by setting the third parameter equal to zero.

In this part of the research, Logist 5 and the Tetrachoric Method, the latter of which was outlined in the
preceding chapter in analyzing the Iowa Test data, were used for the item analysis of empirical data,
and a comparison was made between the estimated item parameters obtained by assuming the normal
ogive model and those obtained by Logist 5 assuming the (two-parameter) logistic model. It is well
known (Birnbaum, 1968) that the (two-parameter) logistic model provides us with a good approximation
to the normal ogive model if we set the scaling factor D equal to 1.7. In some cases, item parameter
estimation was also made by Logist 5 assuming the three-parameter logistic model, and the comparison was
extended to those results also.
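
The quality of the approximation mentioned above can be checked numerically. The following is a minimal sketch, not part of the original analysis: it scans a grid (the range and step are arbitrary choices made here) and reports the largest absolute gap between the normal ogive and the logistic function with D = 1.7.

```python
import math

D = 1.7  # scaling factor quoted in the text


def normal_ogive(z):
    """Standard normal distribution function, written with math.erf."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def logistic(z, d=D):
    """Logistic function with scaling factor d."""
    return 1.0 / (1.0 + math.exp(-d * z))


def max_abs_gap(lo=-6.0, hi=6.0, steps=12001):
    """Largest absolute difference between the two curves on an even grid."""
    width = (hi - lo) / (steps - 1)
    return max(
        abs(normal_ogive(lo + i * width) - logistic(lo + i * width))
        for i in range(steps)
    )


gap = max_abs_gap()
print(f"largest gap on the grid: {gap:.5f}")
```

On this grid the gap stays below 0.01, which is the sense in which the two models are usually treated as interchangeable.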

The research report for this part of the research contains three hundred thirty-five pages, and it is
difficult even to summarize the results. For this reason, only limited illustrations will be given in this
chapter. For details and other results and findings, see [1.2.7].

[VIII.1] Shiba’s Data

The empirical data adopted here were taken from the test data provided by the courtesy of Professor
Sukeyori Shiba of the University of Tokyo, Japan. Shiba’s research on the measurement of word/phrase
comprehension has been introduced earlier (Samejima, 1980). The battery of tests used for the
construction of Shiba’s word/phrase comprehension scale consists of thirteen tests, AP1, AP2, A1, A2, A3,
A4, A5, A6, J1, J2, S1, S2 and U. Each of these thirteen tests contains thirty to sixty multiple-choice
items, each of which has a set of five alternatives. These tests differ in difficulty, and each is designed for
a different age group of subjects, ranging from four years of age to the ages of college students. There
are subsets of items included in two tests which are adjacent to each other in difficulty. For example,
items 37 through 56 of Test J1 are also items 1 through 20 of Test J2. There are 480 test items in
total. The number of examinees used by Shiba for the ability scale construction varies between 219
preschoolers for Test AP1 and 924 second graders of senior high schools for Test S1.

The principal investigator has found Shiba’s tests very well constructed. Professor Shiba and she
have been collaborating for the past ten years, and she decided to adopt Shiba’s data for this part of the
research. Some of the results obtained by the Tetrachoric Method were taken from Professor Shiba’s
work itself.

Out of Shiba’s thirteen tests, four tests were chosen for the present research, i.e., A5, A6, J1 and
J2. The examinees who took these four tests in Shiba’s original data are as follows.

Test A5: 599 fifth graders in elementary schools
Test A6: 412 sixth graders in elementary schools
Test J1: 614 first graders in junior high schools
Test J2: 758 third graders in junior high schools

These groups of examinees and their performances are called, for brevity, A5/0599 Case, A6/0412
Case, J1/0614 Case and J2/0758 Case, respectively. There are also 461 second graders in junior high
school who took Test J1 in Shiba’s original data. In order to increase the number of examinees, this
group of 461 subjects and their performances were added to the J1/0614 Case, to provide us with the
J1/1075 Case. This case was further joined by an additional group of 1,184 students of four different
junior high schools in Tokyo, to whom Test J1 was administered in some other research of Shiba’s. We
shall call this largest group of examinees and their performances J1/2259 Case. Thus we have six cases
in total, with three of them partly overlapping.

When  the  item  parameter  estimation  was  made  by  Logist  5,  in  some  cases  two  or  more  tests  and 
the  corresponding  samples  of  examinees  were  combined,  in  order  to  increase  the  number  of  test  items 
and  hence  to  improve  the  accuracy  of  estimation.  Table  8-1  presents  the  resulting  combinations  of  tests 
and  the  numbers  of  examinees.  When  two  or  more  adjacent  tests  are  combined,  the  number  of  items  is 
less  than  the  sum  total  of  the  numbers  of  items  of  the  separate  tests  because  of  the  overlapping  items. 

[VIII.2] Results

Figure 8-1 presents the estimated ability distribution of each of the original and combined examinee
groups, which was obtained through the item parameters estimated by the Tetrachoric Method.

Figure 8-2 shows eight scatter diagrams of the estimated item discrimination parameters of Test
J1. They consist of four pairs, in each of which the Logist 5 results of "$c_g$-zero" (left) and "$c_g$-free"
(right) are compared with the results of the Tetrachoric Method. These results are obtained for different
cases, which are specified in the captions of the separate pairs. We can see a substantial consistency
between the two sets of estimated item discrimination parameters in the first graph of each of the four
pairs, i.e., when the (two-parameter) logistic model is assumed in using Logist 5, whereas there exists little
consistency in the second graph of each pair, i.e., when the three-parameter logistic model is assumed. We
notice that the greatest consistency is observed in the first graph of the first pair of scatter diagrams
and in the first graph of the third pair. They are Case J1/1075:$c_g$-zero of Logist 5 against J1/1075 Case
of the Tetrachoric Method and Case J1/2259:$c_g$-zero against J1/2259 Case of the Tetrachoric Method,
i.e., the only two situations in which the same examinee group is used for both Logist 5 and the
Tetrachoric Method, and no guessing parameter is assumed in using Logist 5. This fact suggests that
these two methods provide us with consistent results when the item parameter configurations are such
as those of Test J1, if the sample size is 1,000 or above. The corresponding eight scatter diagrams for
the estimated item difficulty parameters are presented in Figure 8-3. We can see a tendency similar
to the one observed for the estimated discrimination parameters, although the inconsistency between the
two sets of estimates is less conspicuous when the three-parameter logistic model is assumed in using Logist
5. These tendencies persisted almost unchanged even after certain scale adjustments had been
made in order to make the comparison more adequate (cf. [1.2.7], Chapters 6 and 8).

Figure 8-4 presents four graphs which clarify how the estimated item parameters differ when the
three-parameter logistic model is assumed in comparison with those when the (two-parameter) logistic model is
assumed. The first two graphs concern the group of 1,075 examinees and the last two the
group of 2,259 examinees, and in each pair the first graph concerns the adoption of the
(two-parameter) logistic model and the second the three-parameter logistic model. In each graph, the
estimated item difficulty parameters of the items of Test J1 are taken on the abscissa, and the estimated
discrimination parameters are taken on the ordinate. Both the results of the Tetrachoric Method and
those of Logist 5 are plotted in each of the four graphs, to make the total number of points 110. To avoid
confusion, five different plotting symbols are used in these graphs,
and an arrow is drawn for each item from the point indicating the result of the Tetrachoric Method to
that of Logist 5.

TABLE 8-1

Tests, Numbers of Items, Number of Examinees and Other Information

FIGURE 8-1

Estimated Ability Distributions (Solid Lines) of the A5/0599, A6/0412, J1/0614 and J2/0758 Cases,
Those of the J1/1075 Case (Long Dashed Line) and of the J1/2259 Case (Short Dashed Line), Together
with Those (Dotted Lines) of the Combined Examinee Groups, A5-A6, J1-J2 and A5-A6-J1-J2.

Comparison of the first graph of Figure 8-4 with the second, and of the third graph with the fourth,
discloses how radically the two estimated parameters of these items of Test J1 are enhanced because
of the existence of the guessing parameter $c_g$ when, in using Logist 5, the three-parameter logistic model
is assumed. These tendencies are similarly observed in both pairs, where the examinee groups of 1,075
and of 2,259 examinees are involved, respectively. These tendencies persisted almost
unchanged even after a certain scale adjustment had been made in order to make the comparison more
adequate (cf. [1.2.7], Chapter 8).

Table 8-2 presents the direct estimates of the mean and the standard deviation of the distribution of
the maximum likelihood estimate of ability, $\hat{\theta}$, for each of the five examinee groups, which are shown
as Examinee Group 2, obtained from the combined linear relationship in each case. Since there is
more than one way of obtaining these two values in each case, all of these results are presented in Table
8-2. In this table, Examinee Group 1 indicates the group of examinees upon which the item parameters
were estimated by the Tetrachoric Method, and Examinee Group 2 means the group of individuals upon
which they were estimated by Logist 5. Thus the mean and the standard deviation of the distribution
of $\hat{\theta}$ of Case J1-J2, for example, can be estimated in two ways, i.e., through the scatter diagrams of
the estimated parameters of the items of Test J1, and through those of the items of Test J2. In the
same table, the weighted averages of the estimated mean and standard deviation of the distribution of
$\hat{\theta}$ are also given for each examinee group. The weight adopted here is the number of the examinees in
Examinee Group 1. We can see in these results that most estimates for the same group of examinees
are close to each other.
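
The weighted averaging described above can be sketched as follows. This is only an illustration: the two mean estimates are hypothetical placeholders, not values from Table 8-2, and the weights are taken to be the sizes of two Examinee Group 1 samples (614 and 758, the J1/0614 and J2/0758 Cases) purely for the sake of the example.

```python
def weighted_average(values, weights):
    """Average of values, each weighted by the matching entry of weights."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(v * w for v, w in zip(values, weights)) / total


# Hypothetical estimates of the mean of the ability estimates for one combined
# group, obtained through two different tests (placeholders, not Table 8-2).
estimated_means = [0.10, 0.16]
# Numbers of examinees in the corresponding Examinee Group 1 samples.
group1_sizes = [614, 758]

combined_mean = weighted_average(estimated_means, group1_sizes)
print(round(combined_mean, 5))
```

The same function applies unchanged to the estimated standard deviations.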


Figures 8-5 through 8-8 present five estimated item characteristic functions for each of four items
of Tests A5, A6, J1 and J2, respectively. In each graph, the result based upon the estimated item
parameters obtained by the Tetrachoric Method is drawn by a solid line, and the other four curves,
drawn with dashes of different lengths, are based upon the item parameters estimated by Logist
5. We notice that there are basically two sets of estimated item parameters which were obtained by
Logist 5, i.e., one based upon either Case A5-A6 or Case J1-J2 and the other upon Case A5-A6-J1-J2.
For brevity, we shall call the former approach Method A and the latter Method B. In each of these two
cases the estimated item parameters were adjusted twice, i.e., first on the assumption that the mean
and the standard deviation of the distribution of $\hat{\theta}$ are the same as those of the distribution of $\theta$, and,
secondly, without this assumption. In each graph of Figures 8-5 through 8-8, the results based upon
Method A and upon the first and the second scale adjustments are drawn by a long dashed line and
a short dashed line, respectively, and those based upon Method B and upon the first and the second
adjustments are shown by a dashed line of medium length and a dotted line, respectively.

From these results, we can say the following.

(1)  For  many  items,  the  two  Logist  5  results  based  upon  the  second  scale  adjustment  are  close 
to  each  other,  while  those  based  upon  the  first  scale  adjustment  are  substantially  different 
from  each  other. 

(2)  In  addition  to  the  above,  the  two  Logist  5  results  based  upon  the  second  scale  adjustment 
also  tend  to  be  closer  to  the  result  of  the  Tetrachoric  Method. 
These two findings seem to justify the second scale adjustment, and also to support the consistency in
the results of the two methods, i.e., the Tetrachoric Method and Logist 5.

FIGURE 8-2

Estimated Item Discrimination Parameters of the 55 Items of Test J1 Obtained by Logist 5 Plotted against Those Obtained by
the Tetrachoric Method. In Using Logist 5, Logistic Model Is Assumed in the Graph on the Left Hand Side and Three-parameter
Logistic Model Is Assumed in the Graph on the Right Hand Side. Both Sets of Estimates in Each Graph Are the Original Ones,
i.e., before Any Scale Adjustment. The Best Fitted Linear Relationship When the Logistic Model Is Assumed Is Drawn by a
Thin, Solid Line and the One Based upon Both Parameters Is Shown by a Thick, Solid Line. J1/1075 Case (Upper Graphs)
and J1/0614 Case on the Abscissa and J1/1075 Case on the Ordinate (Lower Graphs).

FIGURE 8-2 (Continued)

J1/2259 Case (Upper Graphs) and J1/0614 Case on the Abscissa and J1/2259 Case on the Ordinate
(Lower Graphs).

[FIGURE 8-3, caption partly lost]

J1/0614 Case on the Abscissa and J1/1075 Case on the Ordinate.

FIGURE 8-3 (Continued)

J1/2259 Case.

FIGURE 8-4

Estimated Item Discrimination Parameter $a_g$ Plotted against Estimated Item Difficulty Parameter
$b_g$, Which Were Obtained by the Tetrachoric Method and Those Which Were Obtained by Logist 5
Assuming Logistic Model (Upper Graph) and Three-parameter Logistic Model (Lower Graph),
Respectively, for Each of the 55 Items of Test J1. For Each Item, an Arrow Is Drawn from the
Tetrachoric Method Result to the Logist 5 Result. J1/1075 Case.

Figures 8-9 and 8-10 present the results of the J1/1075 and J1/2259 Cases for each of four items of Test
J1. Again in each of these graphs a solid curve represents the estimated item characteristic function
in the normal ogive model, whose item parameters were originally obtained by the Tetrachoric Method
and then adjusted. The other four curves are based upon the estimated item parameters obtained by
Logist 5, with two of them assuming the (two-parameter) logistic model and the other two assuming
the three-parameter logistic model.

From these results we can find the following.

(3)  For  many  items  the  logistic  curve  obtained  with  the  second  scale  adjustment,  which  is 
shown  by  a  short  dashed  line,  is  very  close  to  the  normal  ogive  curve,  which  is  drawn  by 
a  solid  line. 

(4)  For certain items, the three-parameter logistic curves are drastically different from the
other three curves, whereas for many other items they are close to the other three for
the range of $\theta$, $(-1.0, \infty)$, regardless of the fact that the estimated discrimination
parameters are substantially larger.

[VIII.3] Discussion

The Tetrachoric Method has been criticised recently on the grounds that: 1) the tetrachoric correlation
matrix does not turn out to be positive definite, and 2) the correlation does not handle too difficult and
too easy items well. While the second criticism makes sense, the principal investigator does not agree
with this negative standpoint. First of all, it should be remembered that, in using the tetrachoric
correlation coefficient, we need the assumption of bivariate normality for each pair of response tendencies.
Care must be taken, therefore, to make sure that our subjects are a practically randomly selected sample
of a "non-restricted" population, before we use the Tetrachoric Method. Secondly, the first criticism
is mostly based upon results obtained by poorly written computer programs for the tetrachoric correlation
coefficient. With a well written computer program and a suitable group of subjects, the method can
be useful, unless the test includes too many items that are too difficult and/or too easy. This was
demonstrated by the success of the method in analyzing Shiba’s Data, and also in analyzing the Iowa Test Data, which were
introduced in Chapter VII.
IX Item Parameter Estimation Using Logist 5 on Simulated
Data

It has been shown in the result of the item analysis of Shiba’s Data that the discrimination parameter
estimated by Logist 5 tends to be greater if the third parameter $c_g$ is set free, and to a lesser extent
the same is true of the difficulty parameter. It has also been observed that in spite of this fact the
resulting estimated item characteristic function tends to be closer to those obtained by the Tetrachoric
Method and by Logist 5 with zero as the set value of $c_g$ than those enhanced $a_g$ and $b_g$ suggest.
This fact should be taken as a warning to researchers who have been accepting $a_g$ as the discrimination
power and $b_g$ as the difficulty index of the item when the three-parameter logistic model is assumed. The
truth is that, unlike in the two-parameter models such as the normal ogive and the logistic models, in
the three-parameter model both the discrimination and the difficulty of the item are contaminated by
the guessing parameter $c_g$, and neither $a_g$ nor $b_g$ has a meaning by itself.
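
A small numerical illustration of this point, offered as a sketch rather than as part of the original study: in the usual form of the three-parameter logistic model, the probability of a correct answer at $\theta = b_g$ is $(1 + c_g)/2$ rather than one half, and the slope there is $D a_g (1 - c_g)/4$, so the same $a_g$ and $b_g$ describe different curves for different $c_g$. The parameter values below are hypothetical.

```python
import math

D = 1.7  # scaling factor


def icf_3pl(theta, a, b, c, d=D):
    """Three-parameter logistic item characteristic function."""
    return c + (1.0 - c) / (1.0 + math.exp(-d * a * (theta - b)))


def slope_at(theta, a, b, c, h=1e-6):
    """Central-difference approximation to the slope of the curve at theta."""
    return (icf_3pl(theta + h, a, b, c) - icf_3pl(theta - h, a, b, c)) / (2.0 * h)


a, b = 1.0, 0.0  # hypothetical discrimination and difficulty parameters
for c in (0.0, 0.2):  # hypothetical guessing parameters
    p_at_b = icf_3pl(b, a, b, c)
    steepest = slope_at(b, a, b, c)
    print(f"c = {c:.1f}: P(b) = {p_at_b:.3f}, slope at b = {steepest:.3f}")
```

With $c_g = 0$ the curve passes through 0.5 at $b_g$; with $c_g = 0.2$ it passes through 0.6 and is flatter there, although $a_g$ and $b_g$ are unchanged.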

Since Shiba’s Data are empirical data, there is no way of knowing the true item characteristic function
of each item. With simulated data, however, we can produce items whose true item characteristic
functions follow the normal ogive model. If we assume the three-parameter logistic model for these
test items, instead of the normal ogive or the logistic model, and estimate the three item parameters
simultaneously by using Logist 5, shall we obtain an estimate of $c_g$ which is close enough to the true
value, zero, and estimates of $a_g$ and $b_g$ which are close enough to their true values? Or will this
additional free parameter $c_g$ contaminate the result so that we will be provided with a substantially
different set of estimated item parameters?

In  this  part  of  the  research  we  tried  Logist  5  on  a  set  of  simulated  data  and  pursued  this  issue.  This 
chapter  will  outline  its  method  and  results.  For  the  details  and  more  information,  see  [1.2.8]. 

[IX.1] Simulated Data

Two tests were hypothesized, which consist of ten and thirty-five dichotomous items, respectively,
each following the normal ogive model. For brevity, we shall call them Ten Item Test and Thirty-Five
Item Test, respectively. They were used separately, and also together as a test of forty-five items, and
these cases are called Cases 1, 2 and 3, respectively. In addition to these three, we have Case 4 of eighty
items. This last case was created rather artificially in order to observe the results based upon a larger
number of items.

The hypothesized ability distribution is uniform, for the interval of ability $\theta$, (-2.5, 2.5). Starting
with -2.475 and ending with 2.475, five hypothetical subjects were placed at each of the one hundred
points with the common interval width of 0.050, to create the 500 Subject Case. Later, this was
repeated three times more, to obtain the 2,000 Subject Case. The Monte Carlo Method was used to produce
a response pattern for each hypothetical subject. In practice, however, each item of the Thirty-Five
Item Test was a graded item having two difficulty parameters. It was redichotomized by using the first
difficulty parameter only, and each response pattern was adjusted accordingly. Later, when Case 4 was
created, these same response patterns were used again by redichotomizing each item using the second
difficulty parameter.
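
The data generation described above can be sketched as follows for the Ten Item Test. This is an illustration under stated assumptions, not the program used in the research: the random number generator and its seed are arbitrary, the item parameters are the original theoretical values listed for the Ten Item Test in Table 9-1, and the graded-item step used for the Thirty-Five Item Test is not reproduced.

```python
import math
import random


def ability_grid(first=-2.475, width=0.05, n_points=100, per_point=5):
    """Place per_point subjects at each of n_points equally spaced abilities."""
    return [
        round(first + i * width, 3)
        for i in range(n_points)
        for _ in range(per_point)
    ]


def normal_ogive(theta, a, b):
    """Normal ogive item characteristic function with parameters a and b."""
    return 0.5 * (1.0 + math.erf(a * (theta - b) / math.sqrt(2.0)))


def simulate(thetas, items, seed=0):
    """Draw one dichotomous response pattern per subject by Monte Carlo."""
    rng = random.Random(seed)  # seed chosen arbitrarily for reproducibility
    return [
        [1 if rng.random() < normal_ogive(t, a, b) else 0 for a, b in items]
        for t in thetas
    ]


# (a, b) pairs: original theoretical parameters of the Ten Item Test.
TEN_ITEM_TEST = [
    (1.5, -2.5), (1.0, -2.0), (2.5, -1.5), (1.0, -1.0), (1.5, -0.5),
    (1.0, 0.0), (2.0, 0.5), (1.0, 1.0), (2.0, 1.5), (1.0, 2.0),
]

thetas = ability_grid()
patterns = simulate(thetas, TEN_ITEM_TEST)
print(len(thetas), thetas[0], thetas[-1], patterns[0])
```

Calling `ability_grid(per_point=20)` gives the layout of the 2,000 Subject Case, assuming the three repetitions reuse the same one hundred points.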
[IX.2] Method

Logist 5 was used twice for each combination of a subject group and a set of items, first for estimating
the three parameters $a_g$, $b_g$ and $c_g$, and then for estimating $a_g$ and $b_g$ only by setting $c_g = 0$,
as we did in analyzing Shiba’s Data. This means that we assumed the three-parameter logistic model
in the first situation, and the (two-parameter) logistic model in the second.

In Logist 5, the origin and the unit of ability $\theta$ are set at the mean and the standard deviation of
its maximum likelihood estimate $\hat{\theta}$, for all subjects whose $\hat{\theta}$ are within the interval (-3.0, 3.0) as
the result of the last iteration. This causes some problems, because in so doing the effect of the error
involved in $\hat{\theta}$ is ignored, and also the excluded subjects affect the resulting scale in each case. Since
there is no simple way to make the scale adjustment, however, these scale differences were not adjusted
in the research.
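
The scale convention described above can be sketched as follows. This is not Logist 5 code: it only illustrates setting the origin and unit from the estimates that fall inside (-3.0, 3.0), and the use of the population form of the standard deviation is an assumption made here. The ability estimates in the example are hypothetical.

```python
import statistics


def logist_style_scale(theta_hats, lo=-3.0, hi=3.0):
    """Origin and unit from the estimates strictly inside (lo, hi)."""
    kept = [t for t in theta_hats if lo < t < hi]
    origin = statistics.mean(kept)
    unit = statistics.pstdev(kept)  # population form assumed for this sketch
    return origin, unit


def rescale(theta_hats, origin, unit):
    """Express every estimate, kept or excluded, on the new scale."""
    return [(t - origin) / unit for t in theta_hats]


# Hypothetical ability estimates; the first and last fall outside (-3.0, 3.0).
estimates = [-4.0, -1.0, 0.0, 1.0, 5.0]
origin, unit = logist_style_scale(estimates)
print(origin, round(unit, 5), [round(t, 3) for t in rescale(estimates, origin, unit)])
```

Because the origin and unit are computed from error-laden estimates rather than from true abilities, the resulting unit generally differs from the standard deviation of $\theta$ itself, which is the problem noted in the text.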

The theoretical item and individual parameters were transformed in order to make them comparable
to the results obtained by Logist 5. Since we have

(9.1)  $E(\theta) = (\alpha + \beta)/2$

and

(9.2)  $\mathrm{Var.}(\theta) = (\beta - \alpha)^2/12$

for the uniform distribution with the interval $(\alpha, \beta)$, the origin of $\theta$ was kept as it was, and the unit
was made 1.443375673 times larger.
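
The transformation can be verified with a few lines of arithmetic. The sketch below assumes the usual behaviour of item parameters under a change of unit: the discrimination parameter is multiplied by the new unit and the difficulty parameter is divided by it. The check uses the first item of the Ten Item Test, whose adjusted values appear in Table 9-1.

```python
import math


def uniform_mean_sd(alpha, beta):
    """Mean and standard deviation of the uniform distribution on (alpha, beta)."""
    return (alpha + beta) / 2.0, (beta - alpha) / math.sqrt(12.0)


def adjust_item(a, b, alpha=-2.5, beta=2.5):
    """Re-express (a, b) on the scale whose unit is the uniform standard deviation."""
    mean, sd = uniform_mean_sd(alpha, beta)
    return a * sd, (b - mean) / sd


mean, sd = uniform_mean_sd(-2.5, 2.5)
a_adj, b_adj = adjust_item(1.5, -2.5)  # original parameters of item 1
print(f"unit = {sd:.9f}, adjusted a = {a_adj:.5f}, adjusted b = {b_adj:.5f}")
```

The adjusted values agree with the Adj. columns of the tables, e.g. 2.16506 and -1.73205 for item 1.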

[IX.3] Results

Tables  9-1  through  9-8  present  the  true  and  estimated  discrimination  and  difficulty  parameters  for 
the  eight  combinations  of  a  hypothetical  test  and  a  subject  group.  We  can  see  there  is  a  general  tendency 
of  the  enhancement  of  the  estimated  discrimination  parameters,  and  to  a  lesser  extent  of  the  estimated 
difficulty  parameters. 

This enhancement of the estimated discrimination parameters is more revealing if we plot those
values against the theoretical discrimination parameters $a_g$. Figures 9-1 and 9-2 illustrate those
results of the 500 and 2,000 Subject Cases, respectively, with both the three-parameter logistic and
the logistic models assumed for Case 3. In each of these graphs, the upper limit set for the estimated
discrimination parameter in using Logist 5 is indicated by a dotted, horizontal line. Also a solid line
with the angle of 45 degrees from the abscissa passing (0,0) is drawn in each graph. We can see in these
results that, although there is some improvement in those obtained by setting $c_g = 0.0$ in Logist 5,
many discrimination parameters are grossly overestimated in both the 500 and 2,000 Subject Cases. This
tendency is even more conspicuous for the two cases with smaller numbers of test items. The enhancement
is reduced to some extent in the results of Case 4, especially in the 2,000 Subject Case. It is still
substantial, however.

The corresponding results on the difficulty parameters in the 500 and 2,000 Subject Cases are
illustrated as Figures 9-3 and 9-4, respectively. In these graphs, the interval of $\theta$, $(-\sqrt{3}, \sqrt{3})$, for
TABLE 9-1

Theoretical and Estimated Item Parameters in the 500 and 2,000 Subject Cases for
Each Item of the Ten Item Test. Three-Parameter Logistic Model Is Assumed. Case 1.

      Discrimination Parameter                        Difficulty Parameter                            Guessing Parameter
      Theoretical             Estimated               Theoretical             Estimated               Estimated
Item        Org.        Adj.    500 S.C.  2,000 S.C.        Org.        Adj.    500 S.C.  2,000 S.C.    500 S.C.  2,000 S.C.
   1     1.50000     2.16506     4.00000     0.95615    -2.50000    -1.73205    -7.00850    -4.53097     0.11111     0.00000
   2     1.00000     1.44338     1.12847     1.89857    -2.00000    -1.38564    -2.27705    -1.25111     0.11111     0.23137
   3     2.50000     3.60844     4.00000     7.00000    -1.50000    -1.03923    -1.49140    -1.02797     0.01285     0.01897
   4     1.00000     1.44338     1.49368     2.16571    -1.00000    -0.69282    -0.76647    -0.34862     0.12154     0.15474
   5     1.50000     2.16506     4.00000     5.73384    -0.50000    -0.34641    -0.39155    -0.07764     0.01397     0.04219
   6     1.00000     1.44338     1.81350     2.45766     0.00000     0.00000    -0.00846     0.25474     0.04756     0.06358
   7     2.00000     2.88675     3.08340     6.30172     0.50000     0.34641     0.32296     0.49745     0.00000     0.00499
   8     1.00000     1.44338     1.47734     2.11129     1.00000     0.69282     0.72304     0.81349     0.00000     0.00645
   9     2.00000     2.88675     4.00000     5.73230     1.50000     1.03923     1.10169     1.04610     0.00000     0.00000
  10     1.00000     1.44338     0.65759     0.98121     2.00000     1.38564     2.60342     2.08713     0.00000     0.00000

TABLE 9-2

Theoretical and Estimated Item Parameters in the 500 and 2,000 Subject Cases for
Each Item of the Ten Item Test. Logistic Model Is Assumed. Case 1.

      Discrimination Parameter                        Difficulty Parameter
      Theoretical             Estimated               Theoretical             Estimated
Item        Org.        Adj.    500 S.C.  2,000 S.C.        Org.        Adj.    500 S.C.  2,000 S.C.
   1     1.50000     2.16506     7.00000     4.46012    -2.50000    -1.73205    -1.79784    -1.82777
   2     1.00000     1.44338     1.66645     1.48512    -2.00000    -1.38564    -1.40424    -1.41994
   3     2.50000     3.60844     ?.83938     4.77435    -1.50000    -1.03923    -1.00147    -0.98488
   4     1.00000     1.44338     1.38928     1.55463    -1.00000    -0.69282    -0.64011    -0.63939
   5     1.50000     2.16506     3.61872     3.26944    -0.50000    -0.34641    -0.18574    -0.25476
   6     1.00000     1.44338     1.53595     1.60692     0.00000     0.00000     0.11857     0.10502
   7     2.00000     2.88675     3.38480     4.45453     0.50000     0.34641     0.52054     0.51174
   8     1.00000     1.44338     1.70183     1.66029     1.00000     0.69282     0.86021     0.88896
   9     2.00000     2.88675     7.00000     6.08234     1.50000     1.03923     1.18334     1.18419
  10     1.00000     1.44338     1.00086     1.12968     2.00000     1.38564     2.09119     2.01825

[Table header and caption missing in the source. The thirty-five rows of theoretical values below coincide with those of the Thirty-Five Item Test in Table 9-4.]

Theoretical

Discrimination Parameter      Difficulty Parameter
        Org.        Adj.        Org.        Adj.
     1.80000     2.59808    -4.75000    -3.29090
     1.90000     2.74241    -4.50000    -3.11769
     2.00000     2.88675    -4.25000    -2.94449
     1.50000     2.16506    -4.00000    -2.77128
     1.60000     2.30940    -3.75000    -2.59808
     1.40000     2.02073    -3.50000    -2.42487
     1.90000     2.74241    -3.00000    -2.07846
     1.80000     2.59808    -3.00000    -2.07846
     1.60000     2.30940    -2.75000    -1.90526
     2.00000     2.88675    -2.50000    -1.73205
     1.50000     2.16506    -2.25000    -1.55885
     1.70000     2.45374    -2.00000    -1.38564
     1.50000     2.16506    -1.75000    -1.21244
     1.40000     2.02073    -1.50000    -1.03923
     2.00000     2.88675    -1.25000    -0.86603
     1.60000     2.30940    -1.00000    -0.69282
     1.80000     2.59808    -0.75000    -0.51962
     1.70000     2.45374    -0.50000    -0.34641
     1.90000     2.74241    -0.25000    -0.17321
     1.70000     2.45374     0.00000     0.00000
     1.50000     2.16506     0.25000     0.17321
     1.80000     2.59808     0.50000     0.34641
     1.40000     2.02073     0.75000     0.51962
     1.90000     2.74241     1.00000     0.69282
     2.00000     2.88675     1.25000     0.86603
     1.60000     2.30940     1.50000     1.03923
     1.70000     2.45374     1.75000     1.21244
     1.40000     2.02073     2.00000     1.38564
     1.90000     2.74241     2.25000     1.55885
     1.60000     2.30940     2.50000     1.73205
     1.50000     2.16506     2.75000     1.90526
     1.70000     2.45374     3.00000     2.07846
     1.80000     2.59808     3.25000     2.25167
     2.00000     2.88675     3.50000     2.42487
     1.40000     2.02073     3.75000     2.59808

Estimated

[Values listed in the order in which they appear in the source; entries for items without estimates are missing, so the rows cannot be aligned with the items above.]

First run of values:
     7.00000     7.00000     2.34600     1.66937     5.95996     4.54064
     3.60897     3.72440     5.93530     2.15613     4.30412     3.08081
     3.41671     3.73975     3.55043     3.26689     2.51771     3.36240
     2.49348     3.92561     4.36628     2.61472     2.54991     1.97089
     6.12277     4.30567     2.50260     7.00000     3.18005

Second run of values:
     2.10115     4.95191     2.01814     6.25083     2.45684     2.62153
     3.93175     3.57269     5.79677     4.14441     2.50797     3.82653
     3.00092     3.29292     3.15448     3.43681     2.97724     2.68637
     3.58997     2.33960     3.63878     3.48851     2.72612     2.90554
     2.25224     3.14405     3.39256     2.18581     6.00998     1.91086

Isolated value:
     7.00000
TABLE 9-4

Theoretical and Estimated Item Parameters in the 500 and 2,000 Subject Cases for
Each Item of the Thirty-Five Item Test. Logistic Model Is Assumed. Case 2.

      Discrimination Parameter                        Difficulty Parameter
      Theoretical             Estimated               Theoretical             Estimated
Item        Org.        Adj.    500 S.C.  2,000 S.C.        Org.        Adj.    500 S.C.  2,000 S.C.
  11     1.80000     2.59808         ---         ---    -4.75000    -3.29090         ---         ---
  12     1.90000     2.74241         ---         ---    -4.50000    -3.11769         ---         ---
  13     2.00000     2.88675         ---         ---    -4.25000    -2.94449         ---         ---
  14     1.50000     2.16506     7.00000     2.66184    -4.00000    -2.77128    -2.24211    -2.72393
  15     1.60000     2.30940         ---     1.80462    -3.75000    -2.59808         ---    -3.48102
  16     1.40000     2.02073     5.85000     2.18952    -3.50000    -2.42487    -2.27498    -2.57504
  17     1.90000     2.74241     2.65490     4.80292    -3.00000    -2.07846    -2.38002    -1.98730
  18     1.80000     2.59808     1.86159     2.27740    -3.00000    -2.07846    -2.43453    -2.28213
  19     1.60000     2.30940     3.04990     2.35694    -2.75000    -1.90526    -1.84166    -1.94374
  20     2.00000     2.88675     5.26700     3.67092    -2.50000    -1.73205    -1.69718    -1.71751
  21     1.50000     2.16506     3.68387     3.53050    -2.25000    -1.55885    -1.51712    -1.48068
  22     1.70000     2.45374     2.78792     3.44789    -2.00000    -1.38564    -1.41474    -1.32688
  23     1.50000     2.16506     2.29748     3.17041    -1.75000    -1.21244    -1.21515    -1.15032
  24     1.40000     2.02073     2.06001     2.20858    -1.50000    -1.03923    -0.98995    -1.02123
  25     2.00000     2.88675     4.28130     3.59910    -1.25000    -0.86603    -0.85496    -0.86681
  26     1.60000     2.30940     3.00410     2.66034    -1.00000    -0.69282    -0.63407    -0.66631
  27     1.80000     2.59808     2.96481     3.16909    -0.75000    -0.51962    -0.45806    -0.52025
  28     1.70000     2.45374     3.67673     3.11533    -0.50000    -0.34641    -0.30288    -0.32330
  29     1.90000     2.74241     3.46570     3.44479    -0.25000    -0.17321    -0.17130    -0.15421
  30     1.70000     2.45374     3.24069     2.95857     0.00000     0.00000     0.06386     0.04162
  31     1.50000     2.16506     2.48202     2.63270     0.25000     0.17321     0.24964     0.17709
  32     1.80000     2.59808     3.28184     3.56525     0.50000     0.34641     0.39818     0.36913
  33     1.40000     2.02073     2.44574     2.33705     0.75000     0.51962     0.50649     0.49057
  34     1.90000     2.74241     3.75665     3.62024     1.00000     0.69282     0.65520     0.66711
  35     2.00000     2.88675     4.35412     3.52539     1.25000     0.86603     0.79956     0.86196
  36     1.60000     2.30940     2.74792     2.81144     1.50000     1.03923     1.02172     1.03020
  37     1.70000     2.45374     2.63339     3.03734     1.75000     1.21244     1.21153     1.22028
  38     1.40000     2.02073     2.03766     2.36148     2.00000     1.38564     1.38059     1.33170
  39     1.90000     2.74241     3.39088     3.39749     2.25000     1.55885     1.45505     1.50674
  40     1.60000     2.30940     4.77467     3.54667     2.50000     1.73205     1.49452     1.60441
  41     1.50000     2.16506     2.58117     2.27431     2.75000     1.90526     2.07618     1.93828
  42     1.70000     2.45374     7.00000     5.94718     3.00000     2.07846     1.83612     1.85210
  43     1.80000     2.59808     2.91442     1.95702     3.25000     2.25167     2.15710     2.63901
  44     2.00000     2.88675         ---         ---     3.50000     2.42487         ---         ---
  45     1.40000     2.02073         ---     4.38154     3.75000     2.59808         ---     2.24655
TABLE 9-5

Theoretical and Estimated Item Parameters in the 500 and 2,000 Subject Cases for Each Item of the
Ten Item Test and the Thirty-Five Item Test. Three-Parameter Logistic Model Is Assumed. Case 3.

                    Discrimination Parameter                    Difficulty Parameter  Guessing Parameter
             Theoretical           Estimated         Theoretical           Estimated           Estimated
Item      Org.      Adj.       500     2,000      Org.      Adj.       500     2,000       500     2,000
                              S.C.      S.C.                          S.C.      S.C.      S.C.      S.C.

   1   1.50000   2.16506   7.00000   3.16857  -2.50000  -1.73205  -1.58337  -1.70186   0.31911   0.00000
   2   1.00000   1.44338   3.87013   2.20593  -2.00000  -1.38564  -0.82818  -0.99213   0.42360   0.33765
   3   2.50000   3.60844   6.61387   6.11440  -1.50000  -1.03923  -0.88235  -0.94942   0.07876   0.06904
   4   1.00000   1.44338   1.66601   1.60446  -1.00000  -0.69282  -0.57784  -0.65341   0.06874   0.02886
   5   1.50000   2.16506   2.77616   2.49209  -0.50000  -0.34641  -0.27650  -0.33620   0.00000   0.00000
   6   1.00000   1.44338   1.55749   1.58121   0.00000   0.00000   0.02360  -0.00453   0.01465   0.00128
   7   2.00000   2.88675   2.96359   3.40269   0.50000   0.34641   0.36510   0.36758   0.00000   0.00022
   8   1.00000   1.44338   1.76403   1.68448   1.00000   0.69282   0.65648   0.69709   0.00000   0.00000
   9   2.00000   2.88675   4.38452   3.69650   1.50000   1.03923   0.97398   0.98698   0.00000   0.00000
  10   1.00000   1.44338   1.54466   1.67432   2.00000   1.38564   1.32856   1.35974   0.00000   0.00000
  11   1.80000   2.59808         —         —  -4.75000  -3.29090         —         —         —         —
  12   1.90000   2.74241         —         —  -4.50000  -3.11769         —         —         —         —
  13   2.00000   2.88675         —         —  -4.25000  -2.94449         —         —         —         —
  14   1.50000   2.16506   7.00000   2.19806  -4.00000  -2.77128  -2.37410  -2.95371   0.08826   0.10234
  15   1.60000   2.30940         —   1.90897  -3.75000  -2.59808         —  -3.36769         —   0.10234
  16   1.40000   2.02073   7.00000   1.84048  -3.50000  -2.42487  -2.37226  -2.75675   0.08826   0.10234
  17   1.90000   2.74241   2.00167   5.60182  -3.00000  -2.07846  -2.63823  -1.99530   0.08826   0.00000
  18   1.80000   2.59808   1.62591   2.21154  -3.00000  -2.07846  -2.57808  -2.29576   0.08826   0.10234
  19   1.60000   2.30940   4.45727   2.36176  -2.75000  -1.90526  -1.63765  -1.92579   0.43106   0.10234
  20   2.00000   2.88675   6.38769   3.87045  -2.50000  -1.73205  -1.73779  -1.68322   0.00000   0.14877
  21   1.50000   2.16506   2.89572   3.06226  -2.25000  -1.55885  -1.57782  -1.50616   0.00000   0.00000
  22   1.70000   2.45374   3.03416   4.41425  -2.00000  -1.38564  -1.33087  -1.18774   0.14331   0.20630
  23   1.50000   2.16506   3.81430   3.36983  -1.75000  -1.21244  -0.89510  -1.05681   0.28760   0.10975
  24   1.40000   2.02073   2.17074   2.43977  -1.50000  -1.03923  -0.87115  -0.92907   0.09159   0.08490
  25   2.00000   2.88675   3.70917   3.62242  -1.25000  -0.86603  -0.81486  -0.83289   0.00000   0.01472
  26   1.60000   2.30940   2.86861   2.96961  -1.00000  -0.69282  -0.59335  -0.60991   0.00000   0.03712
  27   1.80000   2.59808   3.06301   3.14587  -0.75000  -0.51962  -0.41924  -0.50798   0.00089   0.00000
  28   1.70000   2.45374   4.17905   3.18564  -0.50000  -0.34641  -0.27372  -0.31965   0.00000   0.00000
  29   1.90000   2.74241   3.65756   3.30394  -0.25000  -0.17321  -0.15748  -0.15832   0.00000   0.00000
  30   1.70000   2.45374   3.36662   2.92438   0.00000   0.00000   0.05765   0.03296   0.00000   0.00000
  31   1.50000   2.16506   2.49791   2.60306   0.25000   0.17321   0.24144   0.16800   0.00000   0.00000
  32   1.80000   2.59808   2.97532   3.17624   0.50000   0.34641   0.38466   0.35806   0.00000   0.00000
  33   1.40000   2.02073   2.39944   2.24392   0.75000   0.51962   0.49468   0.48257   0.00000   0.00000
  34   1.90000   2.74241   3.56410   3.65448   1.00000   0.69282   0.64140   0.65986   0.00000   0.00000
  35   2.00000   2.88675   4.31324   3.29432   1.25000   0.86603   0.78225   0.85445   0.00000   0.00000
  36   1.60000   2.30940   2.51338   2.56816   1.50000   1.03923   1.01267   1.02876   0.00000   0.00000
  37   1.70000   2.45374   2.43304   2.70841   1.75000   1.21244   1.20847   1.22648   0.00000   0.00000
  38   1.40000   2.02073   1.99664   2.25976   2.00000   1.38564   1.37635   1.33734   0.00000   0.00000
  39   1.90000   2.74241   2.91426   3.11288   2.25000   1.55885   1.47260   1.52379   0.00000   0.00000
  40   1.60000   2.30940   3.98169   3.20664   2.50000   1.73205   1.51130   1.63052   0.00000   0.00000
  41   1.50000   2.16506   2.40751   2.03311   2.75000   1.90526   2.12160   2.00855   0.00000   0.00000
  42   1.70000   2.45374   7.00000   6.06362   3.00000   2.07846   1.86710   1.88062   0.00000   0.00000
  43   1.80000   2.59808   2.68843   1.73795   3.25000   2.25167   2.20918   2.79685   0.00000   0.00000
  44   2.00000   2.88675         —         —   3.50000   2.42487         —         —         —         —
  45   1.40000   2.02073         —   7.00000   3.75000   2.59808         —   2.19260         —   0.00198
Theoretical and Estimated Item Parameters in the 500 and 2,000 Subject
Cases for Each Item of the Ten Item Test and the Thirty-Five Item Test.
Logistic Model Is Assumed. Case 3.

                    Discrimination Parameter                    Difficulty Parameter
             Theoretical           Estimated         Theoretical           Estimated
Item      Org.      Adj.       500     2,000      Org.      Adj.       500     2,000
                              S.C.      S.C.                          S.C.      S.C.

   1   1.50000   2.16506   4.95337   3.28161  -2.50000  -1.73205  -1.67240  -1.67029
   2   1.00000   1.44338   1.69224   1.57696  -2.00000  -1.38564  -1.36571  -1.35902
   3   2.50000   3.60844   4.07728   4.17736  -1.50000  -1.03923  -1.01281  -1.02694
   4   1.00000   1.44338   1.46003   1.53778  -1.00000  -0.69282  -0.70019  -0.70004
   5   1.50000   2.16506   2.68987   2.47682  -0.50000  -0.34641  -0.30131  -0.34419
   6   1.00000   1.44338   1.44917   1.55961   0.00000   0.00000  -0.01392  -0.00994
   7   2.00000   2.88675   2.86243   3.32482   0.50000   0.34641   0.36627   0.36951
   8   1.00000   1.44338   1.72479   1.67575   1.00000   0.69282   0.66329   0.70095
   9   2.00000   2.88675   4.34492   3.69979   1.50000   1.03923   0.98918   0.99343
  10   1.00000   1.44338   1.53441   1.68119   2.00000   1.38564   1.34230   1.36275
  11   1.80000   2.59808         —         —  -4.75000  -3.29090         —         —
  12   1.90000   2.74241         —         —  -4.50000  -3.11769         —         —
  13   2.00000   2.88675         —         —  -4.25000  -2.94449         —         —
  14   1.50000   2.16506   7.00000   2.91928  -4.00000  -2.77128  -2.15752  -2.60081
  15   1.60000   2.30940         —   1.74182  -3.75000  -2.59808         —  -3.53483
  16   1.40000   2.02073   7.00000   2.09438  -3.50000  -2.42487  -2.15752  -2.60017
  17   1.90000   2.74241   2.73182   5.71243  -3.00000  -2.07846  -2.32255  -1.91514
  18   1.80000   2.59808   1.93571   2.41736  -3.00000  -2.07846  -2.37937  -2.22083
  19   1.60000   2.30940   2.78781   2.41276  -2.75000  -1.90526  -1.85334  -1.92091
  20   2.00000   2.88675   7.00000   3.84414  -2.50000  -1.73205  -1.66676  -1.69924
  21   1.50000   2.16506   3.05070   3.28211  -2.25000  -1.55885  -1.54605  -1.48938
  22   1.70000   2.45374   2.81049   3.11682  -2.00000  -1.38564  -1.42047  -1.33659
  23   1.50000   2.16506   2.18879   2.84317  -1.75000  -1.21244  -1.22508  -1.15676
  24   1.40000   2.02073   1.98367   2.16013  -1.50000  -1.03923  -0.99519  -1.02270
  25   2.00000   2.88675   3.78513   3.38927  -1.25000  -0.86603  -0.85820  -0.86525
  26   1.60000   2.30940   2.79550   2.57449  -1.00000  -0.69282  -0.63042  -0.66551
  27   1.80000   2.59808   2.94721   3.13646  -0.75000  -0.51962  -0.45199  -0.51928
  28   1.70000   2.45374   4.01387   3.15357  -0.50000  -0.34641  -0.29894  -0.32714
  29   1.90000   2.74241   3.47887   3.27011  -0.25000  -0.17321  -0.17677  -0.16301
  30   1.70000   2.45374   3.26431   2.89248   0.00000   0.00000   0.04783   0.03087
  31   1.50000   2.16506   2.41301   2.57220   0.25000   0.17321   0.23779   0.16763
  32   1.80000   2.59808   2.87394   2.12995   0.50000   0.34641   0.38646   0.36035
  33   1.40000   2.02073   2.34406   2.22892   0.75000   0.51962   0.49893   0.48570
  34   1.90000   2.74241   3.47064   3.61903   1.00000   0.69282   0.65094   0.66576
  35   2.00000   2.88675   4.20559   3.28752   1.25000   0.86603   0.79511   0.86083
  36   1.60000   2.30940   2.49622   2.58101   1.50000   1.03923   1.02635   1.03417
  37   1.70000   2.45374   2.42736   2.72468   1.75000   1.21244   1.22281   1.23107
  38   1.40000   2.02073   1.99253   2.27881   2.00000   1.38564   1.39037   1.34037
  39   1.90000   2.74241   2.94510   3.18444   2.25000   1.55885   1.48530   1.52314
  40   1.60000   2.30940   3.99991   3.23204   2.50000   1.73205   1.52512   1.63091
  41   1.50000   2.16506   2.40032   2.03234   2.75000   1.90526   2.13695   2.01093
  42   1.70000   2.45374   7.00000   6.04216   3.00000   2.07846   1.87981   1.87873
  43   1.80000   2.59808   2.70006   1.73892   3.25000   2.25167   2.22019   2.79913
  44   2.00000   2.88675         —         —   3.50000   2.42487         —         —
  45   1.40000   2.02073         —   4.00385   3.75000   2.59808         —   2.31592
FIGURE 9-1

Estimated Discrimination Parameter Obtained by Logist 5 Plotted against the True Discrimination
Parameter for Each Item of the Ten Item Test (*) and of the Thirty-Five Item Test (♦). Case 3,
500 Subject Case. Three-Parameter Logistic Model Is Assumed.


Estimated Difficulty Parameter Obtained by Logist 5 Plotted against the True Difficulty Parameter $b_g$ for Each Item of the
Ten Item Test (*) and of the Thirty-Five Item Test (♦). Case 3, 500 Subject Case. Three-Parameter Logistic Model Is Assumed
(Left). Guessing Parameter $c_g$ Is Set Equal to Zero, i.e., Logistic Model Is Assumed (Right). Linear Regression of $\hat{\theta}$ on $\theta$ Is
Plotted by Dots for Reference.


Reduced Scatter Diagram of the Theoretical and Estimated Item Discrimination
Parameters Obtained by Excluding All Items Whose Theoretical Item Difficulty
Parameters Are Outside the Interval $(-\sqrt{3}, \sqrt{3})$. These Excluded Items Are
Plotted by Hollow Shapes. Case 4, 2,000 Subject Case.


Estimated Individual Parameters Plotted against the Theoretical Individual Parameters. Three-Parameter Logistic Model (Left
Graph) and Logistic Model (Right Graph) Are Assumed, Respectively. Case 3, 500 Subject Case. Linear Regression of $\hat{\theta}$ on $\theta$
Is Plotted by a Dotted Line.

FIGURE 9-7

Estimated Individual Parameters Plotted against the Theoretical Individual Parameters. Three-Parameter Logistic Model (Left
Graph) and Logistic Model (Right Graph) Are Assumed, Respectively. Case 3, 2,000 Subject Case. Linear Regression of $\hat{\theta}$ on
$\theta$ Is Plotted by a Dotted Line.


which  the  ability  distribution  is  uniform,  is  indicated  by  two  solid,  vertical  lines.  A  reasonably  good 
agreement  is  observed  between  the  estimated  and  the  theoretical  difficulty  parameters  for  the  subset  of 
items whose theoretical difficulty parameters are within the interval $(-\sqrt{3}, \sqrt{3})$. Some improvement
is observed in the results obtained by assuming the logistic model compared with those obtained by
assuming the three-parameter logistic model, both in the 500 Subject Case and in the 2,000 Subject Case, in
each of Cases 1, 2, 3 and 4.

Figure 9-5 presents the estimated item discrimination parameters of Case 4 plotted against the true
discrimination parameters. In this figure, all items whose difficulty parameters are outside the range
$(-\sqrt{3}, \sqrt{3})$ are plotted by hollow shapes. Case 4 has produced the best agreement of the four cases in
each situation, and we can see that the agreement is further improved by excluding the items plotted by hollow shapes. A
substantial enhancement of the estimated discrimination parameters still exists, however. The resulting
estimated item characteristic functions are all compared with the theoretical curves in graphs in [1.2.8].

Figures 9-6 and 9-7 illustrate the estimated individual parameters $\hat{\theta}$ plotted against the true values
$\theta$, for the 500 and 2,000 Subject Cases, respectively, in Case 3 where we have thirty-test items.

[IX.4] Discrimination Shrinkage Factor and Difficulty Reduction Index

It has been observed that the enhancement of the estimated discrimination parameter exists in both
situations, i.e., where the third parameter $c_g$ is left free and where it is set at zero in Logist 5. This indicates
the effect of a scaling problem in Logist 5, where the standard deviation of the maximum likelihood
estimate $\hat{\theta}$, instead of that of $\theta$, is defined as the unit. Since the standard deviation of $\hat{\theta}$ is
expected to be larger than that of $\theta$ for data such as ours, the estimated discrimination parameters
are expected to be enhanced, and the estimated difficulty parameters are expected to be regressed, relative to
what they actually are.
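
The scaling effect described above can be sketched numerically. The following is a minimal illustration, not the procedure used in Logist 5: it assumes that dividing the ability scale by a constant `sd_theta` multiplies each discrimination parameter by that constant and divides each difficulty parameter by it. The function name `rescale_item_parameters` and the choice of a uniform ability distribution on (-2.5, 2.5) are assumptions made for the example; the latter was chosen because it reproduces the "Adj." values tabulated for $a_g = 1.5$, $b_g = -2.5$.

```python
import math


def rescale_item_parameters(a, b, sd_theta):
    """Re-express (a, b) after the ability scale is divided by sd_theta.

    If theta' = theta / sd_theta, then a * (theta - b) is unchanged when
    a' = a * sd_theta and b' = b / sd_theta.
    """
    if sd_theta <= 0:
        raise ValueError("sd_theta must be positive")
    return a * sd_theta, b / sd_theta


# Assumption for illustration: ability is uniform on (-2.5, 2.5),
# whose standard deviation is 5 / sqrt(12), roughly 1.44338.
SD_UNIFORM = 5.0 / math.sqrt(12.0)

a_adj, b_adj = rescale_item_parameters(1.5, -2.5, SD_UNIFORM)
print(round(a_adj, 5), round(b_adj, 5))
```

Under this assumption, the standardized ability interval becomes $(-\sqrt{3}, \sqrt{3})$, which agrees with the interval quoted in the text.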

It has also been observed that the enhancement of the estimated discrimination parameter tends
to be larger in the situation where the three-parameter logistic model is assumed, in comparison with the
situation where $c_g$ is set equal to zero. This fact suggests that there exists an enhancement of the
estimated discrimination parameter caused by the third parameter $c_g$. It is also suggested from the
estimated difficulty parameters that an enhancement of the estimated difficulty parameter will also
exist when the three-parameter logistic model is assumed, if an appropriate scale adjustment of $\theta$ is made.

For these reasons, the principal investigator proposed the discrimination shrinkage factor and the
difficulty reduction index, following a certain rationale. They are given by

(9.3)    $\xi(c_g^*) = [\log(1 - c_g^*) - \log(1 + c_g^*)] \, [\log(1 - 2c_g^*)]^{-1}$

and

(9.4)    $\delta(c_g^* \mid a_g) = (D a_g)^{-1} [\log(1 + c_g^*) - \log(1 - c_g^*)]$ ,

where $c_g^*$ is the estimated positive third parameter when it actually should equal zero. Thus we can
revise the estimated discrimination parameter $a_g^*$ by setting
FIGURE 9-8


Estimated Discrimination Parameters Which Were Revised by the Discrimination
Shrinkage Factor Plotted against the Theoretical Discrimination Parameters.
Three-Parameter Logistic Model Is Assumed. Case 3, 2,000 Subject Case.
FIGURE 9-9

Estimated Difficulty Parameters Which Were Revised by the Difficulty Reduction Index Plotted
against the Theoretical Difficulty Parameters. Three-Parameter Logistic Model Is Assumed.
Case 3, 2,000 Subject Case.
and the estimated difficulty parameter $b_g^*$ by

$b_g = b_g^* - \delta(c_g^* \mid a_g)$ .

Figures 9-8 and 9-9 illustrate the results of these revisions for the 2,000 Subject Case in Case 3.
Comparison of these figures with Figures 9-2 and 9-4 indicates the effects of these revisions.
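
The revision can be sketched in a few lines of code. This is only an illustration under stated assumptions, not the author's program: the formulas follow a reading of the scanned equations (9.3) and (9.4), the scaling constant `D = 1.7` is assumed, and the revised discrimination parameter is taken to be the product of the shrinkage factor and the estimate $a_g^*$, which the scan does not show explicitly. The function names are hypothetical, and the input values are illustrative estimates of the kind listed for the 2,000 Subject Case in Table 9-5.

```python
import math

D = 1.7  # assumed scaling constant of the logistic model


def shrinkage_factor(c_star):
    """Discrimination shrinkage factor, as read from equation (9.3)."""
    if not 0.0 <= c_star < 0.5:
        raise ValueError("c_star must lie in [0, 0.5)")
    if c_star == 0.0:
        return 1.0  # limiting value: no guessing estimate, no shrinkage
    numerator = math.log(1.0 - c_star) - math.log(1.0 + c_star)
    return numerator / math.log(1.0 - 2.0 * c_star)


def difficulty_reduction(c_star, a):
    """Difficulty reduction index, as read from equation (9.4)."""
    return (math.log(1.0 + c_star) - math.log(1.0 - c_star)) / (D * a)


def revise(a_star, b_star, c_star):
    """Revise 3PL estimates (a*, b*) using the estimated c*.

    Assumption: the revised a is shrinkage_factor(c*) * a*.
    """
    a = shrinkage_factor(c_star) * a_star
    b = b_star - difficulty_reduction(c_star, a)
    return a, b


# Illustrative estimates: a* = 2.20593, b* = -0.99213, c* = 0.33765.
a_rev, b_rev = revise(2.20593, -0.99213, 0.33765)
print(round(a_rev, 4), round(b_rev, 4))
```

With these inputs the revised values move toward the adjusted theoretical values of 1.44338 and -1.38564 that the tables give for an item with original parameters 1.0 and -2.0. Note that the factor is defined only for $c_g^* < 0.5$, since $\log(1 - 2c_g^*)$ must exist.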

[IX.5] Discussion

The  present  research  disclosed  the  danger  of  accepting  the  results  of  Logist  5  without  modifications. 
It  is  the  principal  investigator’s  wish  that  researchers  become  aware  of  the  danger,  and  think  twice 
before  accepting  the  results  as  they  are. 

Again  in  this  chapter  only  a  small  part  of  the  research  was  presented  because  of  shortage  of  space. 
There  are  many  more  interesting  findings  and  observations  contained  in  [1.1.8]. 


X Discussion and Conclusions


In writing this final report, the principal investigator has found it extremely difficult to summarize
and integrate all the different approaches and findings of the research. They include: (1) basically
theoretical works such as the proposal of two new latent trait models (cf. Chapters V and VI), that of
the MLE bias function for general discrete responses (cf. Chapter III), constancy in item information
and the information loss caused by noise (cf. Chapter IV), etc., (2) combinations of theoretical and
methodological works including a substantial amount of computer programming, such as the estimation
of the operating characteristics of discrete item responses (cf. Chapter II), etc., (3) basically empirical
studies such as the analyses of the Iowa Data and of Shiba’s data (cf. Chapters VII and VIII), and
(4) basically a simulation study, namely the observations of the results of Logist 5 (cf. Chapter IX). The
above is a rough categorization, for the contents overlap. To give some examples, the proposal
of the discrimination shrinkage factor and the difficulty reduction index in Chapter IX is theoretical
work, the plausibility function of a distractor observed in Chapter VII has something to do with the
Informative Distractor Model, and all theoretical works have some empirical studies or their prospects
involved.

The  principal  investigator  believes  that  all  these  different  approaches  are  essential  to  the  advancement 
of  latent  trait  theory,  and  are  the  reason  for  the  fruitfulness  of  the  present  research  project.  She  regrets, 
however,  that  she  had  to  leave  out  many  other  interesting  findings  and  observations  from  this  final 
report,  because  of  the  shortage  of  space  and  of  the  difficulty  in  summarizing  all  of  them.  She  hopes  that 
readers of this final report who have become interested in particular topics will read the original research
reports  and/or  conference  handouts. 

The  principal  investigator  also  believes  that  she  has  accomplished  something  during  this  research 
period,  in  line  with  the  proposed  objectives  for  the  advancement  of  latent  trait  theory.  At  the  end  of 
this  research  project,  she  would  like  to  express  her  gratitude  to  the  Office  of  Naval  Research  for  this 
research  opportunity. 
APPENDIX A


Contents of Advanced Seminar on Latent Trait Theory (1982)

I Estimation of the Operating Characteristics of the Discrete Item
Responses and That of Ability Distributions: I

(I.1) Relationship between the Estimation of the Operating Characteristics and That of
Ability Distributions

(I.2) No Mathematical Forms Are Assumed for the Operating Characteristics of the Unknown
Test Items

(I.3) Small Number of Examinees in the Calibration Data

(I.4) Old Test

(I.5) Set of Five Hundred Maximum Likelihood Estimates

(I.6) Unknown Test Items Whose Operating Characteristics Are to Be Estimated

(I.7) Use of Robust, Indirect Information

(I.8) Transformation of Ability $\theta$ to $\tau$

II  Method  of  Moments  As  the  Least  Squares  Solution  for  Fitting  a 
Polynomial 

(II.1) Approximation to the Density Function from a Set of Observations

(II.2) Method of Moments As the Least Squares Solution for Fitting a Polynomial

(II.3) Direct Use of the Least Squares Solution

(II.4) Solution by the Method of Moments

(II.5) Expanded Use of the Method of Moments

(II.6) Selection of the Interval

(II.7) Comparison of the Results Obtained by the Method of Moments and by the Direct
Least  Squares  Procedure 

III  Estimation  of  the  Operating  Characteristics  of  the  Discrete  Item 
Responses  and  That  of  Ability  Distributions:  II 

(III.1) Estimated Operating Characteristics Which Are Directly Observable from Our
Calibration Data

(III.2) Necessary Correction for the Scale of the Maximum Likelihood Estimate When Used
As a Substitute for the Ability Scale

(III.3) Transformation of $\theta$ to $\tau$ Using the Method of Moments for Fitting a Polynomial

(III.4) Classification of Methods and Approaches

(III.5) Normal Approximation Method

(III.6) Approximation to the Density Function of the Maximum Likelihood Estimate by a
Polynomial Obtained by the Method of Moments

(III.7) Pearson System Method

(III.8) Two-Parameter Beta Method

(III.9) Normal Approach Method

(III.10) Bivariate P.D.F. Approach

(III.11) Histogram Ratio Approach

(III.12) Curve Fitting Approach

(III.13) Conditional P.D.F. Approach

(III.14) Remark on the Approximation of $\phi(\tau \mid \hat{\tau})$ by Normal Density Function


IV  Estimation  of  the  Operating  Characteristics  of  the  Discrete  Item 
Responses  and  That  of  Ability  Distributions:  III 

(IV.1) Objective Testing and Exchangeability

(IV.2) Every Test Has a Limitation

(IV.3) Alternative Estimators for the Maximum Likelihood Estimator

(IV.4) Bayes Estimator with a Uniform Density as the Prior

(IV.5) Subtest 3

(IV.6) Nine Subtests As Our Old Test

(IV.7) Sample Linear Regression of $\hat{\tau}$ on $\tau$

(IV.8) Polynomial Approximation to the Density Function, $g(\hat{\tau})$

(IV.9) Estimated Item Characteristic Functions Obtained upon Subtests 1, 2 and 3

(IV.10) Estimated Item Characteristic Functions Obtained upon the Six Other Subtests


V Adaptive Testing

(V.1) Addition of New Test Items to the Item Pool

(V.2) Weakly Parallel Tests

(V.3) Use of the Amount of Test Information as the Criterion for Terminating the
Presentation of New Test Items

(V.4) Test Information Function and Standard Error of Estimation

(V.5) Old Test for Item Calibration

(V.6) Adaptive Testing Using Graded Test Items

(V.7) Bayesian vs. Maximum Likelihood Estimation in Adaptive Testing


VI  Constant  Information  Model 


(VI.1) Constancy of Information under the Transformation of the Latent Trait

(VI.2) Constancy of Item Information for a Specified Model

(VI.3) Constancy of Item Information for a Set of Models

(VI.4) Exact Area under the Square Root of the Item Information Function

(VI.5) Constant Information Model

(VI.6) Use of Constant Information Model for a Set of Equivalent Test Items Which
Substitutes for the Old Test

(VI.7) How to Detect a Subset of Equivalent Binary Items

(VI.8) Convergence of the Conditional Distribution of the Maximum Likelihood Estimate to
the Asymptotic Normality When a Test Consists of Equivalent Items


VII A New Family of Models for the Multiple-Choice Test Item: I

(VII.1) Mathematical Models and Psychological Reality

(VII.2) Three-Parameter Logistic Model

(VII.3) Tokyo Research

(VII.4) Sato’s Index k

(VII.5) Index k* for the Validation Study of the Three-Parameter Logistic Model

(VII.6) Simulation Study on Index k*

(VII.7) Iowa Tests of Basic Skills

(VII.8) Original and Revised Iowa Data

(VII.9) Informative Distractor Model

(VII.10) Equivalent Distractor Model

(VII.11) Index k* for the Invalidation of the Equivalent Distractor Model

(VII.12) Results Obtained by Using Index k* on Iowa Data

(VII.13) Comparison of the Results on Common Test Items for Three Levels of Examinees in
Iowa Study

(VII.14) Remarks on the Usage of Index k*


VIII New Family of Models for the Multiple-Choice Test Item: II

(VIII.1) Shiba’s Word Comprehension Tests

(VIII.2) Subjects Used in Shiba’s Research

(VIII.3) Methods and Results of Shiba’s Research

(VIII.4) Distractors As Resources of Information

(VIII.5) Mathematical Models in Physics and in Psychology

(VIII.6) Normal Ogive Model on the Graded Response Level and Bock’s Multinomial Model

(VIII.7) A New Family of Models for the Multiple-Choice Test Items

(VIII.8) Basic Functions and Information Functions of the New Models

(VIII.9) Instructions and Mathematical Models

(VIII.10) A New Approach to Data Analysis

IX  Conclusions 
Overview of Latent Trait Models: Paper Presented at the Fifteenth Annual Meeting of the Behaviormetric
Society of Japan, in August, 1987, at Kyushu University, Fukuoka, Japan.


Psychology has its own unique problem of measuring hypothetical constructs, such as ability,
attitude, motivation, personality, and so forth. Since they are hypothesized constructs, there is no way
to  measure  them  directly,  and  indirect  measurement  through  individuals’  responses  to  more  or  less 
concrete  entities,  called  items,  serves  an  important  role.  Thus,  latent  trait  models  have  been  developed 
for  measuring  hypothetical  constructs,  especially  in  the  framework  of  mental  test  theory,  as  well  as  in 
social  attitude  measurement. 

Let $\theta$ be the latent trait, which can be defined either in the unidimensional latent space or in
the  multidimensional  latent  space.  Let  g  denote  the  item,  or  the  smallest  entity,  responses  to  which 
enable  us  to  measure  the  latent  trait  indirectly.  In  mental  measurement,  for  example,  the  latent  trait 
may  be  a  specified  mathematical  ability,  and  an  item  is  a  specific  question  presented  in  a  mathematics 
test;  in  social  attitude  measurement,  the  latent  trait  may  be  the  attitude  toward  a  specific  social 
issue,  and  items  are  questions  incorporated  in  a  questionnaire  specifically  developed  for  the  purpose. 
A typical research project may start with data collection based upon $n$ items which have been developed
for the purpose of measuring a specific hypothetical construct, or latent trait, and the specification of
individual differences among human subjects with respect to the specified latent trait may be at the
end of the research.

Unlike  many  other  researchers  who  work  in  the  area  of  latent  trait  theory,  or  item  response  theory, 
my interest does not stay solely on the dichotomous response level, on which the item score assumes
either 0 or 1. Years ago, for example, I developed graded response models (Samejima, 1969, 1972),
which deal with discrete item responses having more than two item score categories, i.e., $0, 1, \ldots, m_g$,
for item $g$. A general latent trait model for the homogeneous case of the continuous response level,
which deals with items having continuous item responses, was also proposed (Samejima, 1973); and
later, the normal ogive model was expanded to fit the multidimensional latent space (Samejima, 1974).
A direct expansion of the general model for the continuous item response leads us to the situation in
which the conditional distribution of the item score, given the latent trait $\theta$, is allowed to be partly
continuous and partly discrete. This general model is for the unidimensional latent space, and includes
four  different  situations,  i.e.,  the  closed  response  situation,  the  closed/open  response  situation,  the 
open/closed  response  situation,  and  the  open  response  situation. 

The  operating  characteristic  of  each  discrete  response,  or  the  operating  density  characteristic  of 
each  continuous  response,  plays  an  important  role  in  latent  trait  theory.  The  former  is  the  conditional 
probability  of  the  specific  discrete  response,  given  the  latent  trait.  If,  for  instance,  each  question  in 
a  vocabulary  test  is  scored  either  correct  or  incorrect,  then  the  situation  belongs  to  the  dichotomous 
response level, which is a subcategory of the discrete response level. Thus the binary item score,
$u_g$ $(= 0, 1)$, is assigned to each item response. The item characteristic function $P_g(\theta)$ of a dichotomous
item $g$ is defined as

(1)    $P_g(\theta) = \mathrm{Prob.}[u_g = 1 \mid \theta]$ ,

i.e., the conditional probability for the correct answer to item $g$, given ability $\theta$ (cf. Lord and Novick,
1968). Figure 1 illustrates typical monotone increasing item characteristic functions. In this example,
these two dichotomous items follow the normal ogive model, whose item characteristic function can be
written in the form

(2)    $P_g(\theta) = (2\pi)^{-1/2} \int_{-\infty}^{a_g(\theta - b_g)} e^{-u^2/2} \, du$
with the item discrimination parameter $a_g$ $(> 0)$ and item difficulty parameter $b_g$. Note that, in
the limiting situation where $a_g$ approaches positive infinity, this item characteristic function tends to
a step function which is illustrated in Figure 1. This degenerate item characteristic function belongs
to the deterministic model known as the Guttman Scale, or as biorders.
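
As a rough illustration of the normal ogive item characteristic function and of its limiting step-function behavior, the following sketch evaluates the standard normal distribution function at $a_g(\theta - b_g)$ using the error function. The function name `normal_ogive_icc` and the parameter values are hypothetical, chosen only for the example.

```python
import math


def normal_ogive_icc(theta, a, b):
    """Normal ogive item characteristic function.

    Returns the standard normal distribution function evaluated at
    a * (theta - b), i.e. the probability of a correct answer.
    """
    if a <= 0:
        raise ValueError("discrimination parameter a must be positive")
    return 0.5 * (1.0 + math.erf(a * (theta - b) / math.sqrt(2.0)))


# At theta = b the probability is one half, whatever the value of a.
print(normal_ogive_icc(0.5, 1.2, 0.5))

# As a grows, the curve approaches a step function located at b.
for a in (1.0, 10.0, 1000.0):
    print(a, normal_ogive_icc(0.4, a, 0.5), normal_ogive_icc(0.6, a, 0.5))
```

For very large `a`, the two printed probabilities on either side of `b` approach 0 and 1, which corresponds to the deterministic case mentioned above.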


If each question in a questionnaire is answered by checking one of the four categories, i.e., strongly
disagree, disagree, agree, and strongly agree, as is shown in Figure 2, then the situation belongs to the
graded response level. This is also a subcategory of the discrete response level (cf. Samejima, 1969,
1972). Thus the graded item score, $x_g$ (= 0, 1, ..., $m_g$), is assigned to each item response, and in this
specific example $m_g = 3$. The operating characteristic, $P_{x_g}(\theta)$, of the graded item score $x_g$ is defined
as the conditional probability assigned to $x_g$, given $\theta$. In the normal ogive model, for example, this
operating characteristic is given by


(3)  $P_{x_g}(\theta) = (2\pi)^{-1/2} \int_{a_g(\theta - b_{x_g+1})}^{a_g(\theta - b_{x_g})} e^{-u^2/2}\,du$ ,


where $a_g$ (> 0) is the item discrimination parameter and $b_{x_g}$ is the item response difficulty parameter,
which satisfies

(4)  $-\infty = b_0 < b_1 < \ldots < b_{m_g-1} < b_{m_g} < b_{m_g+1} = \infty$ .


Figure 3 illustrates a set of operating characteristics in the normal ogive model on the graded response
level with $a_g = 1.00$, $b_1 = -1.50$, $b_2 = -0.50$, $b_3 = 0.00$, $b_4 = 0.75$ and $b_5 = 1.25$.
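
The operating characteristics of formula (3) can be computed as differences of adjacent cumulative normal ogives. The sketch below is illustrative only; the function names are hypothetical, and the parameter values are those quoted for Figure 3.

```python
import math

def normal_cdf(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def graded_operating_characteristics(theta, a, b):
    """Return [P_0(theta), ..., P_m(theta)] following formula (3).

    b = [b_1, ..., b_m] must be strictly increasing, as required by (4);
    b_0 = -infinity and b_{m+1} = +infinity are handled implicitly.
    """
    # Probability of scoring x or higher: Phi(a(theta - b_x)),
    # equal to 1 for x = 0 and to 0 for x = m + 1.
    cum = [1.0] + [normal_cdf(a * (theta - bx)) for bx in b] + [0.0]
    return [cum[x] - cum[x + 1] for x in range(len(b) + 1)]

if __name__ == "__main__":
    b = [-1.50, -0.50, 0.00, 0.75, 1.25]   # values quoted for Figure 3
    for theta in (-2.0, 0.0, 2.0):
        probs = graded_operating_characteristics(theta, 1.00, b)
        print(theta, [round(p, 3) for p in probs])
```

For any fixed $\theta$ the returned probabilities are nonnegative and sum to one, since the cumulative terms telescope.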

An  interesting  application  of  the  graded  response  model  was  made  by  Roche,  Wainer,  and  Thissen 
in  the  area  of  medical  science  (Roche,  Wainer,  and  Thissen,  1975).  In  their  research,  they  developed  a 
skeletal  maturity  scale,  using  the  knee  joint  as  the  biological  indicator.  There  are  thirty-four  indicators, 
or items, and through the X-ray films of the subject's knee joint each item was scored by experts into two
to  five  graded  categories.  Since  the  subject’s  skeletal  age  does  not  always  coincide  with  his  chronological 
age,  the  skeletal  maturity  scale  based  upon  latent  trait  theory  is  meaningful. 

When the item response is continuous, the operating density characteristic of the continuous item
score must be considered. Let $z_g$ be the continuous response. Without loss of generality, we can take
$z_g$ to be a real number between zero and unity. In the open response situation, it is assumed
that the conditional probability of $z_g$, given the latent trait $\theta$, is zero for any value of $z_g$, including
the two endpoints, and the conditional distribution of $z_g$ is given as a continuous distribution. As
an example, let us consider the response format given as Figure 4. If the subject is asked to respond
to a given statement by checking any point in the line segment shown in Figure 4, except for the two
endpoints, then it will be reasonable to assume that the conditional probability of any particular point
in the line segment is zero, for any fixed $\theta$: thus the open response situation occurs. On the other
hand, if the subject is also allowed to check either one of the endpoints, then it will be unreasonable to
assume that the conditional probability of each endpoint, i.e., $z_g = 0$ or $z_g = 1$, is zero, for people
tend to check either endpoint more often than any other point: thus the closed response situation
occurs, and the conditional distribution of $z_g$, given $\theta$, is continuous for $0 < z_g < 1$, and discrete
at $z_g = 0$ and $z_g = 1$.

We notice that in most experimental situations where response latency is used as a measure of
"quickness" of information processing, and so on, we are forced to terminate the experiment when the
subject's response is too slow. If we treat the response latency as a reversed continuous item score,
assigning zero to the time limit and unity to "zero seconds", this will be a
good example of the closed/open response situation. The conditional probability of $z_g$, given $\theta$, is
zero for any $z_g$ except for $z_g = 0$. Thus the conditional distribution is continuous for $0 < z_g < 1$,
and discrete at $z_g = 0$. In a similar manner, the open/closed response situation is defined as the one
in which the conditional distribution is continuous for $0 < z_g < 1$ and discrete at $z_g = 1$.

Let $P^*_{z_g}(\theta)$ be the conditional probability with which the subject obtains the item score $z_g$ or
greater, given $\theta$. A general mathematical form for $P^*_{z_g}(\theta)$ in the homogeneous case of the continuous
response model is given by

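(5)  $P^*_{z_g}(\theta) = \int_{-\infty}^{a_g(\theta - b_{z_g})} \Psi_g(t)\,dt$

with

(6)  [equation illegible in the source]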

where $a_g$ (> 0) is the item discrimination parameter, $b_{z_g}$ is the item response difficulty parameter,
and $\Psi_g(\cdot)$ is a specific continuous function, which characterizes the model, and is positive almost
everywhere. If, for example, $\Psi_g(t)$ is given by

(7)  $\Psi_g(t) = (2\pi)^{-1/2} e^{-t^2/2}$ ,

then formula (5) will provide us with the normal ogive model, and if it is given by

(8)  $\Psi_g(t) = D e^{-Dt} [1 + e^{-Dt}]^{-2}$ ,

then the logistic model will be defined.
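
To make the two choices in (7) and (8) concrete, the sketch below evaluates both functions and their integrals from negative infinity to $x$. The text leaves the scaling constant $D$ unspecified, so the value 1.7 used here is an assumption (a conventional choice for bringing the logistic curve close to the normal ogive); all function names are hypothetical.

```python
import math

D = 1.7  # assumed scaling constant; not specified in the text

def psi_normal(t):
    """Formula (7): the standard normal density."""
    return math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)

def psi_logistic(t, d=D):
    """Formula (8): the logistic density with scaling constant d."""
    e = math.exp(-d * t)
    return d * e / (1.0 + e) ** 2

def cdf_normal(x):
    """Integral of (7) from -infinity to x."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def cdf_logistic(x, d=D):
    """Integral of (8) from -infinity to x."""
    return 1.0 / (1.0 + math.exp(-d * x))

def max_cdf_gap(lo=-5.0, hi=5.0, steps=2000):
    """Largest absolute difference between the two integrals on a grid."""
    gap = 0.0
    for i in range(steps + 1):
        x = lo + (hi - lo) * i / steps
        gap = max(gap, abs(cdf_normal(x) - cdf_logistic(x)))
    return gap

if __name__ == "__main__":
    print(max_cdf_gap())
```

Under this assumed value of $D$ the two integrated curves stay within about one hundredth of each other over the grid, which is the kind of similarity that motivates using the logistic model as a substitute on the dichotomous level.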

Some years ago, Birnbaum proposed the logistic model as a good substitute for the normal ogive
model on the dichotomous response level (Birnbaum, 1968) because of the similarity of its item characteristic
function to the one in the normal ogive model, and also because of the fact that in the logistic
model there exists a simple sufficient statistic for the response pattern of binary item scores. It is
interesting to note that, as we proceed from the dichotomous response level to the graded response level
and, further, to the continuous level, substantial differences between the two models come to the
surface.

The operating density characteristic, $H_{z_g}(\theta)$, has been defined, and it can be written in the form

(9)  $H_{z_g}(\theta) = a_g \Psi_g\{a_g(\theta - b_{z_g})\} \left\{ \frac{\partial}{\partial z_g} b_{z_g} \right\}$ ,  $0 < z_g < 1$ .


Let $P_{z_g}(\theta)$ be the conditional probability of $z_g$, given $\theta$. We can write

(10)  $\int_0^1 H_{z_g}(\theta)\,dz_g = 1 - \{P_0(\theta) + P_1(\theta)\} \le 1$ ,

where $P_0(\theta)$ and $P_1(\theta)$ indicate $P_{z_g}(\theta)$ for $z_g = 0$ and $z_g = 1$, respectively. We can also write
for the difficulty parameter


(11)  $\lim_{z_g \to 0} b_{z_g} = b_0 \ge -\infty$ ,
      $\lim_{z_g \to 1} b_{z_g} = b_1 \le \infty$ .


It is noted that in the open response situation, $P_0(\theta) = P_1(\theta) = 0$ throughout the whole range of $\theta$,
and an equality holds in (10) and in each formula of (11). In each of the other three situations, however, the
left hand side of (10) becomes less than unity, and equality fails in at least one of the two formulae of (11).

Figure 5 illustrates examples of functional relationships between the continuous item score $z_g$ and
the difficulty parameter $b_{z_g}$ in the closed response situation. The two functional formulae used in those
examples are given in the caption of the figure. In the closed/open response situation, those curves
approach positive infinity as $z_g$ tends to unity; in the open/closed response situation, they approach
negative infinity as $z_g$ tends to zero; and in the open response situation, both of these asymptotic
characteristics must be true.

Figure 6 illustrates the operating density characteristic, $H_{z_g}(\theta)$, in the normal ogive model and in
the logistic model for the five selected values of $z_g$ in the closed response situation, using the linear
difficulty parameter function illustrated in Figure 5. As you can see in this figure, for each and every
item score $z_g$ for $0 < z_g < 1$, the operating density characteristic has a unique local maximum, and
the configurations of those curves in the two separate models are similar.
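
The relationship in (10) can be checked numerically for the closed response situation. The sketch below assumes the normal density (7) and a linear difficulty function $b_{z_g} = b_0 + (b_1 - b_0) z_g$ with finite endpoints; the endpoint values -2 and 2 are hypothetical and are not the ones used for Figures 5 and 6.

```python
import math

def phi(x):
    """Standard normal density, formula (7)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def Phi(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def difficulty(z, b0, b1):
    """Hypothetical linear difficulty function b_z = b0 + (b1 - b0) * z."""
    return b0 + (b1 - b0) * z

def density_characteristic(theta, z, a, b0, b1):
    """Formula (9): a * Psi(a(theta - b_z)) * (d b_z / d z), for 0 < z < 1."""
    return a * phi(a * (theta - difficulty(z, b0, b1))) * (b1 - b0)

def endpoint_probabilities(theta, a, b0, b1):
    """P_0 and P_1 in the closed response situation (b0 and b1 finite)."""
    p0 = 1.0 - Phi(a * (theta - b0))   # probability of exactly z = 0
    p1 = Phi(a * (theta - b1))         # probability of exactly z = 1
    return p0, p1

def integrate_density(theta, a, b0, b1, steps=2000):
    """Midpoint-rule approximation of the left hand side of (10)."""
    h = 1.0 / steps
    return h * sum(density_characteristic(theta, (i + 0.5) * h, a, b0, b1)
                   for i in range(steps))

if __name__ == "__main__":
    theta, a, b0, b1 = 0.3, 1.0, -2.0, 2.0
    p0, p1 = endpoint_probabilities(theta, a, b0, b1)
    print(integrate_density(theta, a, b0, b1), 1.0 - (p0 + p1))
```

Because the normal density has its single maximum at zero, the density characteristic for a fixed $z_g$ peaks at $\theta = b_{z_g}$ under these assumptions, consistent with the unique local maximum noted for Figure 6.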

So far I have attempted to present a rough outline of latent trait theory, selecting several representative
situations and models, among others. An important basic characteristic of this very comprehensive
theory is that it is a probabilistic model, not a deterministic model. Estimation of the operating
characteristics, or of the operating density characteristics, of the item response is, therefore, one of the most
important objectives in the methodology of latent trait theory. There have been many methods and
computer programs developed by researchers in this area. Those methods can be classified into two
categories, i.e., 1) the parametric method, and 2) the nonparametric method. In the former, we assume
a certain mathematical model for the operating characteristic, and the estimation is reduced to that of
the item parameters. In the latter, we attempt to approach the operating characteristic directly, avoiding
assumptions as much as possible. I have developed several methods and approaches in the past eight
years, which belong to the nonparametric category, in the multiyear research projects supported
by the Office of Naval Research (cf. Samejima, 1981). These methods and approaches, which are
listed below, are basically for the discrete item responses in the unidimensional latent space, although
they can be applied to the continuous item responses.

(1)  Approaches 

(i)  Bivariate  P.D.F.  Approach 

(ii)  Histogram  Ratio  Approach 

(iii)  Curve  Fitting  Approach 

(iv)  Conditional  P.D.F.  Approach 

(a)  Simple  Sum  Procedure 

(b)  Weighted  Sum  Procedure 

(c)  Proportioned  Sum  Procedure 


(2)  Methods 

(i)  Pearson  System  Method 

(ii)  Two-Parameter  Beta  Method 

(iii)  Normal  Approach  Method 

Out of those combinations of an approach and a method, the Simple Sum Procedure of the Conditional
P.D.F. Approach combined with the Normal Approach Method has been used most frequently
for analysing empirical data. We assume that we have a set of $n$ items whose operating characteristics
are known, which is called the Old Test following the terminology in mental measurement. For convenience,
let us assume that each item $g$ of the Old Test belongs to the graded response level. Let $I_{x_g}(\theta)$
denote the item response information function, which is defined by

(12)  $I_{x_g}(\theta) = -\frac{\partial^2}{\partial\theta^2} \log P_{x_g}(\theta)$ .


The item information function, $I_g(\theta)$, is given as the conditional expectation of the item response
information function, given $\theta$, i.e.,

(13)  $I_g(\theta) = \sum_{x_g=0}^{m_g} I_{x_g}(\theta) P_{x_g}(\theta)$ .


The response pattern, $V$, of those $n$ items of the Old Test can be written as

(14)  $V = (x_1, x_2, \ldots, x_g, \ldots, x_n)$ .


By virtue of the conditional independence of the item score distributions (Lord and Novick, 1968), the
operating characteristic, $P_V(\theta)$, of the response pattern $V$ is given by

(15)  $P_V(\theta) = \prod_{x_g \in V} P_{x_g}(\theta)$ .


This operating characteristic, $P_V(\theta)$, is also the likelihood function, $L_V(\theta)$, for estimating the
individual parameter $\theta_s$ for individual $s$ from his response pattern, when the item parameters are
known. We can write for the response pattern information function, $I_V(\theta)$,

(16)  $I_V(\theta) = -\frac{\partial^2}{\partial\theta^2} \log P_V(\theta) = \sum_{x_g \in V} I_{x_g}(\theta)$ ,


and the test information function $I(\theta)$ is defined as the conditional expectation of the response pattern
information function, given the latent trait $\theta$. Thus we have

(17)  $I(\theta) = \sum_V I_V(\theta) P_V(\theta) = \sum_{g=1}^{n} I_g(\theta)$ .


Figure 5 illustrates the square root of the test information functions of the Level 11 Vocabulary
Subtest of the Iowa Tests of Basic Skills. In order to make the amount of information equal for the
interval of latent trait of interest, $\theta$ is transformed to $\tau$ by

(18)  $\tau = C_1^{-1} \int_{-\infty}^{\theta} [I(t)]^{1/2}\,dt + C_0$ ,

where $C_0$ is an arbitrary constant for adjusting the origin of $\tau$, and $C_1$ is another arbitrary constant
which equals the square root of the constant test information function, $I^*(\tau)$, of $\tau$.

Using the asymptotic property of the maximum likelihood estimate, which is normally distributed with
the true parameter $\theta$ and the inverse of the square root of the test information as its two parameters,
as an approximation to the conditional distribution of $\hat{\tau}$, given $\tau$, we obtain for the two conditional
moments

(19)  $E(\tau \mid \hat{\tau}) = \hat{\tau} + C_1^{-2} \frac{d}{d\hat{\tau}} \log g^*(\hat{\tau})$

(20)  $\mathrm{Var.}(\tau \mid \hat{\tau}) = C_1^{-2} \left[ 1 + C_1^{-2} \frac{d^2}{d\hat{\tau}^2} \log g^*(\hat{\tau}) \right]$ .
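
A small numerical check of (19) and (20) is sketched below. It assumes that $g^*$ denotes the marginal density of the maximum likelihood estimate, and it approximates the derivatives of $\log g^*$ by central finite differences. The helper names and the normal example are hypothetical; they are chosen because the normal case has a known closed form to compare against.

```python
import math

def conditional_moments(tau_hat, log_g, c1, h=1e-4):
    """Evaluate (19) and (20) using finite differences of log g*.

    log_g : function returning the log of the (assumed) density of tau_hat.
    c1    : the constant C_1, so that C_1^{-2} is the error variance.
    """
    s2 = 1.0 / (c1 * c1)
    d1 = (log_g(tau_hat + h) - log_g(tau_hat - h)) / (2.0 * h)
    d2 = (log_g(tau_hat + h) - 2.0 * log_g(tau_hat) + log_g(tau_hat - h)) / (h * h)
    mean = tau_hat + s2 * d1          # formula (19)
    var = s2 * (1.0 + s2 * d2)        # formula (20)
    return mean, var

def normal_log_density(mu, v):
    """Log density of a normal distribution with mean mu and variance v."""
    return lambda x: -0.5 * math.log(2.0 * math.pi * v) - 0.5 * (x - mu) ** 2 / v

if __name__ == "__main__":
    # Hypothetical case: tau is normal with mean 0.5 and variance 1, and the
    # error variance is 0.25 (C_1 = 2), so tau_hat has variance 1.25.
    log_g = normal_log_density(0.5, 1.25)
    print(conditional_moments(1.0, log_g, 2.0))
```

In this normal example the two formulas reproduce the familiar shrinkage of the estimate toward the population mean and the reduced conditional variance, which gives some reassurance that the reconstructed signs and exponents are consistent.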

In the Simple Sum Procedure of the Conditional P.D.F. Approach, the estimate of the operating
characteristic, $\hat{P}_{k_h}(\theta)$, of the discrete item response $k_h$ to the "unknown" item $h$ is given by

(21)  $\hat{P}_{k_h}(\theta) = \hat{P}^*_{k_h}(\tau) = \sum_{s \in k_h} \hat{\phi}(\tau \mid \hat{\tau}_s) \left[ \sum_{s=1}^{N} \hat{\phi}(\tau \mid \hat{\tau}_s) \right]^{-1}$ ,

where $N$ is the number of individuals in our sample and $\hat{\phi}(\tau \mid \hat{\tau}_s)$ is the estimated conditional density of
$\tau$, given the maximum likelihood estimate $\hat{\tau}_s$ of the individual $s$. In the Normal Approach Method,
this conditional density is approximated by the normal density using the two estimated parameters
derived from (19) and (20) by setting $\hat{\tau} = \hat{\tau}_s$.
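
The procedure just described can be sketched in a few lines. This is a simplified illustration, not the program used in the research: the numerator is read as a sum over the individuals who gave response $k_h$, and the density $g^*$ of the estimates is assumed to be normal with the sample mean and variance of the $\hat{\tau}_s$, a simplification introduced here for brevity. The function names and the toy data are hypothetical.

```python
import math

def normal_pdf(x, mean, var):
    """Normal density with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def simple_sum_estimate(tau, tau_hats, responses, category, c1):
    """Estimated operating characteristic of `category` at the point tau.

    tau_hats  : maximum likelihood estimates, one per individual
    responses : response of each individual to the "unknown" item
    c1        : the constant C_1 of the transformed scale
    """
    n = len(tau_hats)
    m = sum(tau_hats) / n
    v = sum((t - m) ** 2 for t in tau_hats) / n   # variance of the assumed normal g*
    s2 = 1.0 / (c1 * c1)                          # C_1^{-2}
    if v <= s2:
        raise ValueError("variance of estimates must exceed C_1^{-2}")
    num = 0.0
    den = 0.0
    for t_hat, k in zip(tau_hats, responses):
        mean = t_hat + s2 * (-(t_hat - m) / v)    # (19) with normal g*
        var = s2 * (1.0 + s2 * (-1.0 / v))        # (20) with normal g*
        w = normal_pdf(tau, mean, var)            # Normal Approach Method
        den += w
        if k == category:
            num += w
    return num / den

if __name__ == "__main__":
    tau_hats = [-1.0, -0.5, 0.0, 0.5, 1.0]   # toy data
    responses = [0, 0, 1, 1, 1]
    for tau in (-1.0, 0.0, 1.0):
        print(tau, simple_sum_estimate(tau, tau_hats, responses, 1, 2.0))
```

At any point the estimates over all response categories sum to one, because every individual contributes to exactly one numerator and to the common denominator.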

Those methods and approaches for estimating the operating characteristics of discrete item responses
can effectively be applied to on-line item calibration in computerized adaptive testing. The idea
of adaptive testing is to increase the efficiency of ability estimation by presenting to an individual
subject an optimal subset of a small number of items selected from a large item pool. With the rapid progress
of computer technology, computerized adaptive testing has become more and more popular in the past
decade. In adaptive testing, it is necessary to add new items to the item pool as we continue using
it for years. We can use these nonparametric methods and approaches for the on-line calibration of
new items, if we use an appropriate constant amount of test information for all the individuals as the
criterion for terminating the presentation of items from the "old" item pool. Note that in this example,
the Old Test does not consist of a single set of items, and yet the test information of the Old Test is kept
constant, so that we do not need to transform $\theta$ to $\tau$, and, as a result, the estimation procedure
becomes much simpler.

As  we  can  see  from  the  above  examples  of  skeletal  maturity,  reaction  time,  on-line  calibration  in 
computerized  adaptive  testing,  and  so  forth,  latent  trait  theory  has  a  very  broad  area  of  conceivable 
applications.  As  a  researcher  who  has  been  working  on  latent  trait  theory  and  its  methodologies  for 
many  years,  I  wish  to  see  many  more  applications  of  the  theory  in  the  area  of  natural  sciences,  as  well 
as  social  sciences,  in  the  future. 
APPENDIX C


Ten Items of the Rorschach Test and Their Scorings for the Purpose
of Measuring the Intellectual Capability Using Appropriate
Latent Trait Models


(1)  Failure to Justify in the Absence of Imaginal Aspects

(Qualities)  (f)  (0  -  1) 

1  -  [(total  number  of  failures  in  justifying  non-imaginal 
responses)  /  (total  number  of  non-imaginal  responses)] 

NOTE:  When  the  total  number  of  non-imaginal  responses  is  4  or 
less,  we  exclude  this  item  for  the  subject  in  question. 

Example  1.1:  Imaginal  response. 

"It  looks  like  a  river  way  down  at  the  bottom  of  a 
valley. " 

Example  1.2:  Non-imaginal  response. 

"It  looks  like  a  bat." 

Example  1.3:  Failure  in  justifying  the  above  non-imaginal 
response. 

Example  1.2  plus:  "It  just  looks  like  it." 

Example  1.4:  Success  in  justifying  the  above  non-imaginal 
response. 

Example  1.2  plus:  "Because  of  the  way  it's  shaped." 
Model:  Discrete/Continuous  Model,  Closed  Response  Situation. 

(2)  Animal  Ratio  (ANRA)  (0  -  1) 

1  -  [(total  number  of  animal  responses  without  humans) 

/  (total  number  of  responses)] 

NOTE:  One  response  can  contain  2  or  more  content  scorings. 

e.g.; "A man walking a dog." (2 content scorings: H and A)
Because this includes human, it is not considered as an
"animal response", and is counted as 0 in the numerator of
the above ratio.

When the total number of responses is 11 or less, then we
exclude this item for the subject in question.

This rule applies for any item using the total number of
responses in the denominator. In fact, for such a subject
Rorschach diagnosis is practically impossible.

Approximately  50%  of  adults’  responses  are  animal  responses. 


Example  2.1:  Animal  response  without  humans. 

"This  looks  like  a  bat." 

"A  dog  standing  on  a  river  bank." 

Model:  Discrete/Continuous  Model,  Closed  Response  Situation. 
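
The scoring rule for the Animal Ratio, including its exclusion rule, can be sketched as follows. The representation of a response as a set of content codes ("A" for animal, "H" for human) is a hypothetical encoding chosen for illustration.

```python
def animal_ratio(responses, min_responses=12):
    """Animal Ratio score, or None when the item is excluded.

    responses : list of sets of content codes, e.g. {"A"} or {"H", "A"}
    The item is excluded when there are 11 or fewer responses.
    """
    total = len(responses)
    if total < min_responses:
        return None
    # A response counts as an "animal response" only if it contains an
    # animal content scoring and no human content scoring.
    animal_only = sum(1 for codes in responses if "A" in codes and "H" not in codes)
    return 1.0 - animal_only / total

if __name__ == "__main__":
    protocol = [{"A"}] * 6 + [{"H", "A"}] * 3 + [{"H"}] * 3
    print(animal_ratio(protocol))
```

Returning `None` for an excluded item keeps "not scored" distinct from a legitimate score of zero, which matters when the item is later dropped for that subject.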

(3a)  Response  Complexity  (RESCOM)  (0  -  1) 

(1/8)*[(number of justification types + number of imaginal
aspects) for each response summed over all responses]

/ (total number of responses)  if B = 8 or less

1  if B is more than 8

where  B  denotes  the  number  of  elements  in  a  blend  for 
each  response. 

NOTE:  1  response  can  contain  up  to  13  justification  types  and 
imaginal  aspects.  Since  it  is  rare  to  have  more  than  8 
elements  in  a  "blend",  and  in  actual  diagnosis  clinicians 
do  not  usually  make  differences  between  8  and  13,  they  are 
included  in  a  single  item  score,  1  . 

Hereafter,  we  use  "elements  in  a  blend"  for  (number  of 
justification types + number of imaginal aspects). When
abbreviated, justifications (alphabetized) are followed
by imaginal aspects (alphabetized).
e.g.; C.C'.F.HE.HM.V (color, achromatic color, form,
human  emotion,  human  movement,  vista) 

Example 3a.1: F (form) (1 element)

"It looks like a landscape, because of the shape."

Example 3a.2: C'.F (2 elements)

(The above plus:) "And the white could be snow."

Example 3a.3: C.C'.F (3 elements)

(The above plus:) "And the red looks like a
campfire."

Example 3a.4: C.C'.F.V (4 elements)

(The above plus:) "And in the distance a man."

Example 3a.5: C.C'.F.HE.V (5 elements)

(The above plus:) "He looks happy."

Example 3a.6: C.C'.F.HE.HM.V (6 elements)

(The above plus:) "He is cooking."

Example 3a.7: Ca (arbitrary color) (1 element)

"It's a map, because maps are always colored."

Example 3a.8: Cp (projected color) (1 element)

(To a non-colored blot:) "It's a blue bird because
it's blue."

Example 3a.9: Sh (shading) (1 element)

"It's like fog that you can almost see through."

Example 3a.10: T (texture) (1 element)

"It looks like it would feel soft."

Example 3a.11: AM (animal movement) (1 element)

"It looks like a dog barking."

Example 3a.12: OM (object movement) (1 element)

"It looks like a volcano exploding."

Example 3a.13: Ci (inappropriate color) (1 element)

"It looks like a green sheep."

Model:  Discrete/Continuous  Model,  Open/Closed  Response  Situation. 


(3b)  Maximum  Response  Complexity  (MRC)  (0  -  1) 

max{B} / 8  if max{B} = 8 or less

1  if max{B} is more than 8

Model: Discrete/Continuous Model, Open/Closed Response Situation.


(3c)  Proportion  of  Complex  Blends  (PCB)  (0  -  1) 

(number  of  responses  with  4  or  more  elements  in  a  blend) 

/  (number  of  total  responses) 

NOTE:  3  or  less  elements  in  a  blend  cannot  be  considered  as  many, 
so  we  take  4  or  more  to  indicate  "many"  for  each  response. 

Model:  Discrete/Continuous  Model,  Closed  Response  Situation. 


(4)  Corrected  Total  Number  of  Responses  (CRES)  (0  -  1) 

(total number of responses) / 50  if R = 50 or less

1  if R is more than 50


Model: Discrete/Continuous Model, Open/Closed Response Situation.
If the total number of responses is, say, 50 or 99,
the difference is not counted in the diagnosis situation.
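
Items (3b) and (4) share the same form: a count divided by a ceiling, with every value at or above the ceiling mapped to 1. A minimal sketch follows; the function names are hypothetical.

```python
def capped_ratio(value, cap):
    """Return value / cap, with all values at or above the cap scored as 1."""
    return min(value / cap, 1.0)

def maximum_response_complexity(blend_sizes):
    """Item (3b): largest number of elements in a blend, capped at 8."""
    return capped_ratio(max(blend_sizes), 8)

def corrected_total_responses(r):
    """Item (4): total number of responses R, capped at 50."""
    return capped_ratio(r, 50)

if __name__ == "__main__":
    print(maximum_response_complexity([1, 3, 6]))
    print(corrected_total_responses(25), corrected_total_responses(99))
```

The cap is what places a discrete probability mass at the upper endpoint of the score, which fits the Open/Closed Response Situation assigned to both items.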

(5)  Mean  Human  Articulation  (MEHA)  (0  -  1) 

Take  the  mean  of  the  distribution  of  HA1,  HA2,  HA3  and  HA4, 
giving  0,  1/3,  2/3  and  1  for  the  separate  scores.  There  is 
some  doubt  that  HA4  is  not  exactly  ordered,  but  it  has  been 
decided  to  take  the  present  policy  at  this  stage,  and  we  will 
come  back  to  this  point  later  in  the  analysis. 

Example 5.1: HA1. "It looks like a person."

Example 5.2: HA2. "It looks like a big person."

Example 5.3: HA3. "It looks like a big man."

Example 5.4: HA4. "It looks like a policeman."

Model: Discrete/Continuous Model, Closed Response Situation.
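
The Mean Human Articulation score can be sketched as the average of the category values 0, 1/3, 2/3 and 1. The list-of-labels input and the choice to return `None` when a subject has no scored human responses are assumptions made for this illustration.

```python
from fractions import Fraction

# Scores assigned to the four human articulation levels.
HA_SCORES = {
    "HA1": Fraction(0),
    "HA2": Fraction(1, 3),
    "HA3": Fraction(2, 3),
    "HA4": Fraction(1),
}

def mean_human_articulation(levels):
    """Mean of the HA scores over a subject's scored human responses.

    levels : list of labels such as ["HA1", "HA4"]
    Returns None when the list is empty (assumed handling, not in the text).
    """
    if not levels:
        return None
    return float(sum(HA_SCORES[level] for level in levels) / len(levels))

if __name__ == "__main__":
    print(mean_human_articulation(["HA1", "HA2", "HA4", "HA4"]))
```

Exact fractions are used for the category values so that thirds do not accumulate rounding error before the final conversion; the same pattern would apply to the four ordered categories of item (6).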

(6)  Mean  Cognitive  Complexity  (MECOG)  (0  -  1) 

After discussion, we decided that the categories
should be ordered as: 1) simple, 2) diffuse, 3) articulated +
arbitrary and 4) integrated.

Take the mean of the distribution of the above 4 categories,
giving 0, 1/3, 2/3 and 1 for the separate scores for the above 4
categories. There is some doubt about this measure, for one of
the clinicians says that she does not take the frequencies of 1)
and 2) into consideration when she diagnoses. We will come back
to this question of the adequacy of this item later in the
analysis.

Example  6.1:  Simple  cognition. 

"It  looks  like  a  bat." 

Example  6.2:  Diffuse  cognition. 

"It  looks  like  a  cloud." 

Example  6.3:  Articulated  cognition. 

"It  looks  like  a  chair  with  a  back  and  four  legs." 

Example  6.4:  Arbitrary  cognition. 

"It  looks  like  a  spider  wearing  a  hair  ribbon." 


Example  6.5:  Integrated  cognition. 

"It  looks  like  two  people  dancing  together." 

Model: Discrete/Continuous, Closed Response Situation.

(7)  Proportion  of  Pure  Form  Justifications  That  Are  Socially 

Appropriate  (F+%)  (0  -  1) 

(number  of  F+  justifications)  /  (total  number  of  F 
justifications) 

NOTE: Pure form justifications (F) include both F+ and F-,
the latter being pure form justifications that are socially
inappropriate. F is the most common justification, and
approximately 50% of the adult subjects' justifications
belong to this category.

Categorization of F responses into F+ and F- categories is
made following the list in Beck, S. J., et al., 1961.

If the total number of F justifications is 4 or less, the
item is excluded for that subject. Typically, for one
subject there are 10 to 12 F justifications.

Example  7.1:  F+  justification. 

(To the whole area of the inkblot of card 1 which
really looks like a bird:)

"Bird, because it is shaped like a bird."

Example 7.2: F- justification.

(To the same area indicated above:)

"Worm, because it is shaped like a worm."

Model: Discrete/Continuous Model, Closed Response Situation.

(8)  Proportion  of  Whole  Responses  (W%)  (0  -  1) 

(number  of  responses  using  the  whole  inkblot) 

/  (total  number  of  responses) 

NOTE:  Approximately  20%  of  responses  are  whole  responses  in  the 
adult  population. 

There is some concern that the relationship of this item
with the intellectual capability may not be monotonic. We
will come back to this question later in the analysis.

Model:  Discrete/Continuous,  Closed  Response  Situation. 


(9)  Integrated  White  Space  (IWS)  (0,1,2,...) 


(number  of  responses  with  integrated  white  space) 

NOTE:  Usually,  black  and/or  colored  parts  of  the  picture  (inkblot) 
are  used  as  figure  against  a  white  ground.  But  sometimes 
white  spaces  are  integrated  in  the  figure. 

6  or  more  such  responses  are  rare,  and  usually  in  diagnosis 
they  are  considered  in  a  single  category,  i.e.,  "many". 

Example  9.1:  /face 

eye 

Model:  Graded  Response  Model. 

(10)  Range of Content (CONTRA)  (0, 1, 2, ..., 14, many)

(number of different content categories scored)

NOTE: This starts from 1 and goes up to 37. 15 or more are rare.
They may be categorized as "many".

These categories include: 1) animal, 2) special case of
animal (e.g.; dragon), 3) abstraction (e.g.; conflict),
4) animal detail (e.g.; dog's paw), 5) special case of
animal detail (e.g.; phoenix' wing), 6) animal face, 7)
special case of animal face (e.g.; unicorn's face), 8)
anatomy, bony, 9) anatomy, soft (lung), 10) anatomy, sex,
clothing (e.g.; bow tie), 16) cloud, smoke, vapor, 17) death
(e.g.; tombstone), 18) emblems (e.g.; flag), 19) food, 20)
fire (including explosion), 21) geography (e.g.; Italy), 22)
human, 23) special case of human (e.g.; demon), 24) human
detail (excluding human head or face, e.g.; finger), 25)
special case of human detail (e.g.; mermaid's tail), 26)
human face, 27) special case of human face (ghost's face),
28) household (e.g.; drawer cabinet), 29) implements (e.g.;
hammer), 30) landscape, 31) music (e.g.; violin), 32)
religion (e.g.; cross), 33) schemata (e.g.; map), 34)
scientific implements (e.g.; microscope), 35) toy (e.g.;
bicycle), 36) travel (e.g.; airplane), 37) weapons (e.g.;
missile). (cf. Burstein and Loucks, 1988)

Model:  Graded  Response  Model. 

REMARKS: (3a), (3b) and (3c) are included because they are all used in
actual clinical diagnosis for intellectual ability. In the
process of analysis, however, they may be combined into one
item.

Motivational articulation was also considered as an item, but
excluded later, since we found no systematic evidence that
MA1, MA2, MA3 and MA4 are
ordered to reflect intellectual capacity.

(1)  Beck, S. J., Beck, A. G., Levitt, E. E. and Molish, H. B.
Rorschach Test, Vol. 1: Basic Process. New York: Grune
and Stratton, 1961.

(2)  Burstein, A. G. and Loucks, S. Rorschach's Test: Scoring and
Interpretation. New York: Hemisphere. (To be published in
August, 1988.)